{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Prepare the Dataset for Building a Predictive Model\n",
    "\n",
    "We will build a graph convolution model to predict ERK2 activity.  We will train the model to distinguish a set of ERK2 active compounds from a set of decoy compounds.  The active and decoy compounds are derived from the DUD-E database. To generate the best model, we would like decoys with property distributions similar to those of our active compounds.  Suppose this were not the case and the inactive compounds had lower molecular weight than the active compounds.  Our classifier might then be trained to simply separate low molecular weight compounds from \n",
    "high molecular weight compounds.  Such a classifier would have very limited utility in practice. \n",
    "\n",
    "As a first step, we will examine a few calculated properties of our active and decoy molecules.  To build a reliable model, we need to ensure that the properties of the active molecules are similar to those of the decoy molecules. \n",
    "\n",
    "First, let's import the libraries we will need. "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 15,
   "metadata": {},
   "outputs": [],
   "source": [
    "from rdkit import Chem\n",
    "from rdkit.Chem import Draw\n",
    "from rdkit.Chem.Draw import IPythonConsole\n",
    "import pandas as pd\n",
    "from rdkit.Chem import PandasTools\n",
    "from rdkit.Chem import Descriptors\n",
    "from rdkit.Chem import rdmolops\n",
    "import seaborn as sns"
   ]
  },
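  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "To make the concern about property bias concrete, here is a minimal sketch with invented molecular weights (the lists `toy_actives` and `toy_decoys` and the helper `mw_threshold_accuracy` are hypothetical and not part of the DUD-E data).  When the two sets do not overlap in molecular weight, a single cutoff labels every molecule correctly without using any information about ERK2 binding."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Hypothetical molecular weights (Da), invented for illustration only\n",
    "toy_actives = [480.5, 512.3, 455.9, 530.1, 498.7]\n",
    "toy_decoys = [310.2, 295.8, 342.6, 288.4, 365.0]\n",
    "\n",
    "def mw_threshold_accuracy(actives, decoys, threshold):\n",
    "    # Fraction labelled correctly by the rule: active if mw > threshold, decoy otherwise\n",
    "    correct = sum(mw > threshold for mw in actives) + sum(mw <= threshold for mw in decoys)\n",
    "    return correct / (len(actives) + len(decoys))\n",
    "\n",
    "# The toy sets do not overlap, so one cutoff separates them completely\n",
    "print(mw_threshold_accuracy(toy_actives, toy_decoys, threshold=400.0))"
   ]
  },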
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Now we can read a SMILES file into a Pandas dataframe and add an RDKit molecule to the dataframe."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 16,
   "metadata": {},
   "outputs": [],
   "source": [
    "# The .ism file is space-delimited with no header row\n",
    "active_df = pd.read_csv(\"mk01/actives_final.ism\",header=None,sep=\" \")\n",
    "active_rows,active_cols = active_df.shape\n",
    "active_df.columns = [\"SMILES\",\"ID\",\"ChEMBL_ID\"]\n",
    "active_df[\"label\"] = [\"Active\"]*active_rows\n",
    "# Parse each SMILES string into an RDKit molecule stored in the \"Mol\" column\n",
    "PandasTools.AddMoleculeColumnToFrame(active_df,\"SMILES\",\"Mol\")\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Let's define a function to add calculated properties to a dataframe."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 17,
   "metadata": {},
   "outputs": [],
   "source": [
    "def add_property_columns_to_df(df_in):\n",
    "    # Adds the property columns to df_in in place; expects an RDKit molecule column named \"Mol\"\n",
    "    df_in[\"mw\"] = [Descriptors.MolWt(mol) for mol in df_in.Mol]\n",
    "    df_in[\"logP\"] = [Descriptors.MolLogP(mol) for mol in df_in.Mol]\n",
    "    df_in[\"charge\"] = [rdmolops.GetFormalCharge(mol) for mol in df_in.Mol]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "With this function in hand, we can calculate the molecular weight, LogP and formal charge of the molecules.  Once we have these properties we can compare the distributions for the active and decoy sets. "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 18,
   "metadata": {},
   "outputs": [],
   "source": [
    "add_property_columns_to_df(active_df)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Let's look at the first few rows of our dataframe to ensure that it makes sense."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 19,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>SMILES</th>\n",
       "      <th>ID</th>\n",
       "      <th>ChEMBL_ID</th>\n",
       "      <th>label</th>\n",
       "      <th>Mol</th>\n",
       "      <th>mw</th>\n",
       "      <th>logP</th>\n",
       "      <th>charge</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>Cn1ccnc1Sc2ccc(cc2Cl)Nc3c4cc(c(cc4ncc3C#N)OCCCN5CCOCC5)OC</td>\n",
       "      <td>168691</td>\n",
       "      <td>CHEMBL318804</td>\n",
       "      <td>Active</td>\n",
       "      <td><img src=\"\" alt=\"Mol\"/></td>\n",
       "      <td>565.099</td>\n",
       "      <td>5.49788</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>C[C@@]12[C@@H]([C@@H](CC(O1)n3c4ccccc4c5c3c6n2c7ccccc7c6c8c5C(=O)NC8)NC)OC</td>\n",
       "      <td>86358</td>\n",
       "      <td>CHEMBL162</td>\n",
       "      <td>Active</td>\n",
       "      <td><img src=\"\" alt=\"Mol\"/></td>\n",
       "      <td>466.541</td>\n",
       "      <td>4.35400</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>Cc1cnc(nc1c2cc([nH]c2)C(=O)N[C@H](CO)c3cccc(c3)Cl)Nc4cccc5c4OC(O5)(F)F</td>\n",
       "      <td>575087</td>\n",
       "      <td>CHEMBL576683</td>\n",
       "      <td>Active</td>\n",
       "      <td><img src=\"\" alt=\"Mol\"/></td>\n",
       "      <td>527.915</td>\n",
       "      <td>4.96202</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>Cc1cnc(nc1c2cc([nH]c2)C(=O)N[C@H](CO)c3cccc(c3)Cl)Nc4cccc5c4OCO5</td>\n",
       "      <td>575065</td>\n",
       "      <td>CHEMBL571484</td>\n",
       "      <td>Active</td>\n",
       "      <td><img src=\"\" alt=\"Mol\"/></td>\n",
       "      <td>491.935</td>\n",
       "      <td>4.36922</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>Cc1cnc(nc1c2cc([nH]c2)C(=O)N[C@H](CO)c3cccc(c3)Cl)Nc4cccc5c4CCC5</td>\n",
       "      <td>575047</td>\n",
       "      <td>CHEMBL568937</td>\n",
       "      <td>Active</td>\n",
       "      <td><img src=\"\" alt=\"Mol\"/></td>\n",
       "      <td>487.991</td>\n",
       "      <td>5.12922</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "                                              SMILES      ID     ChEMBL_ID  \\\n",
       "0  Cn1ccnc1Sc2ccc(cc2Cl)Nc3c4cc(c(cc4ncc3C#N)OCCC...  168691  CHEMBL318804   \n",
       "1  C[C@@]12[C@@H]([C@@H](CC(O1)n3c4ccccc4c5c3c6n2...   86358     CHEMBL162   \n",
       "2  Cc1cnc(nc1c2cc([nH]c2)C(=O)N[C@H](CO)c3cccc(c3...  575087  CHEMBL576683   \n",
       "3  Cc1cnc(nc1c2cc([nH]c2)C(=O)N[C@H](CO)c3cccc(c3...  575065  CHEMBL571484   \n",
       "4  Cc1cnc(nc1c2cc([nH]c2)C(=O)N[C@H](CO)c3cccc(c3...  575047  CHEMBL568937   \n",
       "\n",
       "    label                                                Mol       mw  \\\n",
       "0  Active  <img src=\"...  565.099   \n",
       "1  Active  <img src=\"...  466.541   \n",
       "2  Active  <img src=\"...  527.915   \n",
       "3  Active  <img src=\"...  491.935   \n",
       "4  Active  <img src=\"...  487.991   \n",
       "\n",
       "      logP  charge  \n",
       "0  5.49788       0  \n",
       "1  4.35400       0  \n",
       "2  4.96202       0  \n",
       "3  4.36922       0  \n",
       "4  5.12922       0  "
      ]
     },
     "execution_count": 19,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "active_df.head()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Now let's do the same thing with the decoy molecules, then combine the two sets into a single dataframe."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 20,
   "metadata": {},
   "outputs": [],
   "source": [
    "decoy_df = pd.read_csv(\"mk01/decoys_final.ism\",header=None,sep=\" \")\n",
    "decoy_df.columns = [\"SMILES\",\"ID\"]\n",
    "decoy_rows, decoy_cols = decoy_df.shape\n",
    "decoy_df[\"label\"] = [\"Decoy\"]*decoy_rows\n",
    "PandasTools.AddMoleculeColumnToFrame(decoy_df,\"SMILES\",\"Mol\")\n",
    "add_property_columns_to_df(decoy_df)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 21,
   "metadata": {},
   "outputs": [],
   "source": [
    "tmp_df = active_df.append(decoy_df)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "With properties calculated for both the active and the decoy sets, we can compare the properties of the two compound sets. To do the comparison, we will use violin plots.  A violin plot is analogous to a boxplot, but it also shows the shape of the distribution as a mirrored density estimate, so features such as skew or multiple peaks are visible.  Ideally, we would \n",
    "like to see similar distributions for the active and decoy sets."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 22,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "<matplotlib.axes._subplots.AxesSubplot at 0x7f6a94f7ab38>"
      ]
     },
     "execution_count": 22,
     "metadata": {},
     "output_type": "execute_result"
    },
    {
     "data": {
      "image/png": "iVBORw0KGgoAAAANSUhEUgAAAYUAAAEGCAYAAACKB4k+AAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADh0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uMy4xLjEsIGh0dHA6Ly9tYXRwbG90bGliLm9yZy8QZhcZAAAgAElEQVR4nO3dd3xT9f7H8dcno+keQFtWy94gG0HxgnsP3PN61eu4ctWrXvfgugfO6716VdSfynVeByI4ENDrQi3idbB3y+ykI6NN8v390VCLIrTQ5GR8no8HjyQnJ8mnJc0753yXGGNQSimlAGxWF6CUUip6aCgopZRqoqGglFKqiYaCUkqpJhoKSimlmjisLmBvdOjQwXTv3t3qMpRSKqYsXLiwzBiTu7P7YjoUunfvTlFRkdVlKKVUTBGRdb91n54+Ukop1URDQSmlVBMNBaWUUk00FJRSSjXRUFBKKdVEQ0EppVQTDQWllFJNNBRUk2AwaHUJSimLaSgoAN577z0OOugg3nrrLatLUUpZSENBAbBy5UoAVqxYYXElSikraSgoAGpra3e4VEolJg0FBUB1dfUOl0qpxKShoACoqqoCoKKy0uJKlFJW0lBQAJSVVwBQUaGhoFQiC2soiEi2iPxHRJaKyBIRGSci7URkjoisCF3mNNv/BhFZKSLLROTwcNamfhYMBqmsbAyF2ppq6uvrLa5IKWWVcB8pPAq8b4zpDwwFlgDXA3ONMX2AuaHbiMhA4HRgEHAE8LiI2MNcn6Lx1JG/oYFAansASktLLa5IKWWVsIWCiGQCvwOeATDG1BtjqoDjgedDuz0PnBC6fjzwijHGZ4xZA6wExoSrPvWzTZs2ARDI6LjDbaVU4gnnkUJPoBR4TkQWicg0EUkD8o0xmwBCl3mh/bsAxc0eXxLatgMRuUhEikSkSL/Rto2NGzcC4M/qCsCGDRusLEcpZaFwhoIDGAE8YYwZDtQROlX0G2Qn28yvNhjzlDFmlDFmVG7uTpcYVa20fv16ECGQkY/YHRQXF+/+QUpF0KZNm6jUnnEREc5QKAFKjDFfhW7/h8aQ2CIinQBCl1ub7V/Q7PFdgY1hrE+FrF27FpIzweYgmJzNmjVrrS5JqR2cd94fuO6666wuIyGELRSMMZuBYhHpF9p0MLAYeAc4N7TtXGBG6Po7wOki4hKRHkAf4Otw1ad+tmz5ChqSGzuB+VNyWL5iBcb86iBNKct4vT6WL19udRkJwRHm578M+LeIJAGrgfNoDKLXROQCYD1wCoAx5icReY3G4PADk40xgTDXl/Cqq6vZumUzwa4jAQiktqembAVbt24lPz/f4uqUUpEW1lAwxnwHjNrJXQf/xv53AXeFsya1o59++gmAQFpje38gPa9pu4aCUolHRzQnuB9++AHERiCtAwDBlHaI3cH3339vcWVKKStoKCS4b4qKCKTngt3ZuMFmoyE9n6KihdYWppSyhIZCAqusrGTlihX4MzrvsN2f2YWSkmIdxKZUAtJQSGBffPEFxhj8OYU7bPdnN/YM/vzzz60oSyllIQ2FBDZv/nxIziCY0m6H7SY5C5Pajrnz5llUmVI/CwS0E2IkaSgkqLKyMr5duBBfu14gvx5MXt+uJ0sWL6akpMSC6pT6mc/ns7qEhKKhkKDeffddjDE0tO+90/sb2vcGEWbOnBnhypTakcfjsbqEhKKhkIB8Ph9vvf02/qyumOTMne5jklJpyO7OzHffxe12R7hCpX725JNPNl0PBoMWVpIYNBQS0KxZs9hWVUV9x8FN21zrF+Bav2CH/eo7DsZdV8dbb70V6RKVarJ69eqm67qGePhpKCQYt9vNCy9OJ5DRkUBGp6btNncFNnfFDvsG03PxZ3XlpZdfoaamJtKlKgVAQ0ND0/WKiopd7KnagoZCgpk+fTpVlRV4u4zcaQPzL/m6jqSurpZnn302AtUp9WvNG5p1
Wvfw01BIIKtWreLVV1+loX1vghktm9comNqe+tz+vP322yxevDjMFSr1ax6PB7s0fodZsWKF1eXEPQ2FBOHz+bjjjjsJ2l34Cka37rFdRmKS0rjjzru0J4iKqIaGBmpqqkl3BumT5WfBgi+tLinuaSgkiH/84x+sXbuGum77Y5wprXuwIwl39wPYtHEDDzzwgK61oCLm888/Jxg0ZDiDjOjgY+XKVXq0EGYaCglgxowZzJw5E1+nfQhkF+z+ATsRyOyEr8tI5s6dy6uvvtrGFSr1a4FAgOeemUaSzZDmNEzo5CPVCc9p+1ZYaSjEuU8//ZRHHnkEf1YB9V1G7NVz1Xfah4acHvzrySeZM2dOG1Wo1M5NmzaNdcUl5KYEECDNaTi6oI4vvvyS9957z+ry4paGQhz75ptv+NtttxFIy8XTayLIXv53i+DteQCBjI7cc889fPrpp21Sp1K/NGfOHF5++WUO6uIlw/nz6cqjCr0MatfAQw8+oGt+hImGQpz69NNPuf6GG/AnZVLX59Cf10vYWzYH7t6H4E9tz5Qpf2Pu3Llt87xKAcYYXnvtNe6++y765/g5u0/dDvfbbTB5UA3tXX6u+evVfPLJJxZVGr80FOLQe++9x5QpU2hIbkdtvyPB4WrbF7A7qet7OA1pudxx5528+eabbfv8KiG53W4eeOABHn/8cUZ18HH1Pttw7OQTKt1puHl4JQUpXqZMmcLzzz+P3++PfMFxSkMhjgSDQZ588knuu+8+GtI7Utf38LYPhO3sSbj7HoY/q4C///3vPProo/qHqfbYN998w3l/OJdZs2ZxbDc3kwfX4rL/9v6ZSYbrh1WxX76P5557jksuuVh7JbURh9UFqLZRU1PD3XffzZdffkl9bn98hWPBFubMtznw9D4IV3ERb731FuvWr+fWW24hOzs7vK+r4saWLVuYNm0ac+bMoVOa4aYR1fTLbtmXiyQ7XDKolpG59bywYhUXX3wxJ598MmeffTaZmTuf6FHtnh4pxIFly5ZxwR8v5MsFX+EtHIuv27jwB8J2YsNXOAZP9/EsWvQd519wAT/++GNkXlvFrG3btvH4449z9tln8fHcORzXzc0doypaHAjNjc6r554xFRyQ7+b1117jzDNO56WXXtJ1GPaQxPJApFGjRpmioiKry7BMMBjkjTfe4F9PPknAnkxdz4kE0/P26LlSls4GwNP/qD2ux1ZXTtrq+djq6zj//PM444wzsNt3cQ5AJZzq6mreeOMN/vP6a7g9HsZ39HJiDw/tk397Suy7v2381n/jiN3PkFpca+f11al8V5ZE+3Y5nHnW2RxzzDG4XGE6jRqjRGShMWbUTu/TUIhNpaWl3H33PSxa9C3+7EI8PcaDI3mPn68tQgEAfz3Jaz/HWbmGQYMHc/NNN9GpU6fdP07FtcrKSl5//XXefutN3B4vIzvUc1JPN13Td7/UZmtCYbullQ7eWJPGsioHOdlZnHb6GRx33HGkpqbu8c8QTzQU4ogxhg8//JBH//53PN56PAVjaOjQt0Uznu5Km4VCY5E4yleRWrwAl9PO5Esv5ZhjjkH2skYVe7Zu3cqrr77Ku+/OpN5Xz5g8H8d191DQgjDYbk9CYbullQ5mrEvlpwonmRnpnHzKqUyaNImMjIxWP1c82VUoaENzDCkrK+OBBx5gwYIFBDPycQ8cj0nOsrqsXxPB36E3NRn5+Nd+xoMPPsj8+fO59tpr6dixo9XVqQgoKSnhpZde4sMPPsAEA4zL93JsNw+d0iK7clr/HD/9c6pZuc3BzHX1PPvss7z80r85/oRJnHLKKbRv3z6i9cQCPVKIAcFgkFmzZvH4E0/g9dXj6TyShvwBez9CuZk2PVJozhicpctIKfmGJIediy++iOOPP17bGuLUunXreOGFF5g/bx52m2FCJw9HFXjpkLLnYbA3Rwq/tL7WzrvrUvhqqwuHw8Fxxx3PGWecQYcOHfb6uWOJnj6KYcXFxUyd+gDf
f/8/Apmd8HTb/zfXVd4bYQuFEPHVkrLuC+zbSug/YADXXnMNPXv2DMtrqcjbHgbz5s0lyQYHd/FwZKGHrKS9/3xpy1DYbovbxrvrUvh0czJ2h4Njjz2OM888M2HCQUMhBjU0NPDyyy/zwgsvEsCGp+toGjr02eu2g98S7lAAGtsaKlaTWvwVtmADZ555Jmeffbb2DIlhZWVlPPfcc7w3ezZOOxzS2c2RhR4y2yAMtgtHKGy31WNj5toUPtucjMPp5JRTT+P0008nPT29zV8rmmgoxJiffvqJ++6/n/Xr1tGQ0wNf4b6YpPD2mohIKIRIgwdX8dc4y1fRuXMXrr32GoYNGxb211Vtx+128/LLL/Paq6/g9zdwSBcPx3Zr2zDYLpyhsF2px8Ybq1P5YouLzIx0zv3DeRx//PE4HPHZ7KqhECPcbjdPP/00b739NiSl4S4cSyC7MCKvHclQ2M6+bQOp678EbzVHH300l1xyScL3Col2xhg+/vhj/vHY3ymvqGRsvo+Te7rJ24s2g92JRChst7bGzisr01lc6aB7t0L+cuVVcfmFRXsfxYCvvvqK+++fSnl5GfV5A/F1Hdl2M5tGqUBWF2oGnoBr4yJmzZ7N5198wdVXXcUBBxxgdWlqJ0pKSnjk4YcoWvgt3TKCTB5ZQ++s+JrvqntGgOuGbWNRmZPpK9fxl7/8hUMPPZRLL72UnJwcq8uLCA0Fi9XU1PDYY4/x4YcfYlJzcA84Zo9HJcckuwNfwWga2vXArPucW265hQMPPJArrrhC51CKEoFAgDfeeINpTz+NnQbO6VvHwV282OJ02IkIjMhtYFC7CmauTWH23Dl8/dUCLr/iLxx00EFxP94mrBPkiMhaEflBRL4TkaLQtnYiMkdEVoQuc5rtf4OIrBSRZSJyeDhriwZfffUV5/z+XD6cMwdfp6HUDjgusQKhmWBaB2r7H4uvywjmf/IJ5/z+97qITxQoLi7mssv+zOOPP87ArDruHVPBoV3jNxCac9nh5F4e7hhdRXuquOOOO7jl5puprKy0urSwisSsaQcaY4Y1O391PTDXGNMHmBu6jYgMBE4HBgFHAI+LSFx2Zvd4PDz88MNcd911VPqgbsCx1HcdCba4/HFbzmajvvMw6gYcT3UgiVtuuYV7772Xurq63T9WtSljDDNmzOCPf7yAdSuWcPHAGq4cUkOOK3bbIPdUl7QAt4yo4rRedXz15eec94dz+eKLL6wuK2ysmCX1eOD50PXngROabX/FGOMzxqwBVgJjLKgvrFavXs2FF13EjBkzqM8fTO2AYwimJUbf6JYKpuZQ2/9ofJ2G8v4HH3D+BRewbNkyq8tKGFVVVdx4ww08/PDD9Emr464xlezfsT5cvaFjgt0GR3fzctuoKjKDVdx44408+OCDeL1eq0trc+EOBQN8KCILReSi0LZ8Y8wmgNDl9vMlXYDiZo8tCW3bgYhcJCJFIlJUWloaxtLb3uzZs7n4kkvYsKUcd78j8BWOAZs26+yUzU5915G4+x3FlspaLp08mbfeeotY7i0XC7799lsuOO8PFH29gLP61PHXodW0c0V2aopo1jU9wJSRlRxZ6GHmzJlccvFFrF692uqy2lS4Q2F/Y8wI4Ehgsoj8bhf77ux7yK8+AYwxTxljRhljRuXm5rZVnWHl9/t56KGHuP/++/Emt6dm4PEEMjtbXVZMCGTkUzPwOHxpHXn00Ue59957qa+vt7qsuBMIBHj22We5+uqrSKqvZMrIKg4vSIy2g9Zy2uCM3m6uHVZN5eb1XHLxxcyaNcvqstpMWEPBGLMxdLkVeIvG00FbRKQTQOhya2j3EqCg2cO7AhvDWV8k1NTUcM011/LOO+9Q33EI7r6HY5wpVpcVWxzJePociq/zMD744AP+cuWVVFVVWV1V3Kiurub666/jhRdeYHy+l9tHVVCY0fJZTBPV4HYN3Dm6gr4ZbqZOncrUqVPjYmGfsIWCiKSJSMb268BhwI/A
O8C5od3OBWaErr8DnC4iLhHpAfQBvg5XfZFQXl7OZZddzqL/fYenxwH4Cka36SR2CUWE+i4j8PSayJIly7h08mS2bt26+8epXVq3bh0XXfhHFi0s4rx+tfxxQN0u10ZWO8pKMvx1aDXHdnMza9YsLr/8MioqKqwua6+E8xMqH/hMRP5H44f7LGPM+8C9wKEisgI4NHQbY8xPwGvAYuB9YLIxJma/rmzZsoU///ky1hUX4+59KP4OfawuKS742/Wkru9hbNpSyqWTJ7NhwwarS4pZP/zwA3+efCmeqq3cNHwbB3bxJXRj8p6yCZzSy8MVQ6pZs3I5ky/9EyUlJVaXtcfCFgrGmNXGmKGhf4OMMXeFtpcbYw42xvQJXVY0e8xdxphexph+xpj3wlVbuFVVVXHlVVexubSM2j6HE8j6VXu52guBjI7U9j2C8qoa/nLlVZSVlVldUsxZtGgRV191FWmmlltHVNIrzkYmW2FkbgM3DNtGbcUWJl/6J9atW2d1SXtEz2W0MY/HwzXXXsumTVuo630IwYx8q0uKS8G0DtT2OYyy8gqu/utfqa2ttbqkmLFixQpuuvEGcl0+bhleSW4Y5y1KNL2y/Nw8ohLjq+Gav15NrPWQBA2FNmWMYerUqaxYsYK6XhMJZOgqY+EUTOtAXe+DWbduPXfdfTfBoH647U5NTQ03XH8dKXi5Zp9tZIRhVtNE1yk1yF/3qaK6soybb76JQCC2zoJrKLSht956i3nz5uHrMiJis5smukBmZ7wFY/jyiy946aWXrC4n6j311FOUV1Rw2aAq2iVriIZL94wA5/erYdmy5bz55ptWl9MqGgptZN26dTzxxBP4swuo77iP1eW0imv9AuzucuzuclKWzsa1foHVJbVKQ94AGtr14Nlnn2PFihVWlxO1iouLmTlzJod19dAzM7a+vcaiffPqGdq+gWefmYbb7ba6nBbTUGgDwWCQe++7Dz92vN33D9vqaOFic1cggQYk0ICjZjM2d4x1qRPB220/gg4Xd99zD36/NpruzEcffYQARxV6rC4lIYjA0YVuPF5fTM2VpKHQBubMmcOSxYtxdx2NcYZ3hTT1GxwuPAVjWbN6Ne+++67V1USlBV9+Qd9sf0JOameVvtl+sl2wYEHsHH1rKOwlr9fLE/96kmB6Lv72va0uJ6H5c7oRyOzE09Oe0d5IO7FlyxY6pepRVCTZBPJTGti6ZYvVpbSYhsJeevvtt6mqrMDbdXTMnTaKOyJ4u46mrraG119/3epqok5tbR1pDj1KiLR0h6GmJvxLibYVDYW94PV6+fdLLxHI7KzdT6NEMK0DDTndePW116ipqbG6nKiSlpqCJ6BfXCLN7RfSY2jtcQ2FvTBr1ixqqqvxdR5udSmqmfrOw/B6PMyYMWP3OyeQnHbtqPLF1p/89OWprKuxs67Gzt3fZjJ9eey12VU1OMjJaWd1GS0WW++QKOL3+3np5VcIZnQkoKOWo0owtT3+rK689trrcTFrZVvpWlDIFq/T6jJaZX2tA0/AhidgY2mVk/W1sbX+SCAIWz1Cly6xM9WNhsIemjdvHuVlpXg7DrG6FLUT9R2HUF29jffff9/qUqJGly5d2OoRdJ2iyKn02QgE0VCId8YYXnr5FUxqDoGsrlaXo3YikNGRYHour7z6asxNMxAuHTt2pCEA2+q1XSFSyryNH7EdO8ZOm6OGwh4oKipi7ZrVePMHa4+jaCWCL38wmzZu5LPPPrO6mqiQk5MDQG2D/tlHSk3od739dx8L9N2xB/790kvgSsPfrqfVpahd8Od0g+RMXnrpZV3bGUhKSgKgXqc8ipjtv2uXy2VtIa2godBKS5Ys4btFi/DmDQSbLlEV1cSGN38wy5YtZdGiRVZXY7nta1s79a8+Yrb/rmNpXXF9e7TSi9OnI04XDbn9rS5FtUBDh96QlMqLL063uhTLbdu2DYB0px4qREpG6HddWVlpcSUtp6HQCqtXr+aLzz/HmzsA
7LHVtS9h2Rx48waxaNG3LF682OpqLLVhwwaS7I3rCqvIyA1NT75x40aLK2k5DYVWmD59OmJ3Up8/yOpSVCs05PVHnMm8+OKLVpdiqTVr1tApNYhN+0ZETLvkIC6HsHbtWqtLaTENhRYqKSlh/vz5eHP7gyN2Go0UYHfizRvIl19+ycqVK62uxhLGGFYsW0phWuyc244HNoGCtAaWL1tmdSktpqHQQi+//DKIjYaOepQQi+rzBiB2Z8KuzlZaWkpVdQ3ddXGdiOue0cCKFctjZryMhkILlJeX8/4HH+Br30fXS4hVDhe+3P7Mnz+fTZs2WV1NxC1duhSAnhk6dXak9czw4/XVs379eqtLaRENhRZ4++23Cfj91OtRQkyrzx+IQXjjjTesLiXilixZgt0GBekaCpHWI3R0tj2Yo52Gwm74/X5mvDMTf3YBJjnL6nLUXjBJaTTkdGfWrNl4vV6ry4moH77/nu4ZAZJ0aE3EdUoNkJYEP/zwg9WltIiGwm58+eWXVG+rol7HJcSFhtx+eDxu/vvf/1pdSsTU1taydOlS+mVpI7MVbAJ9M+tZWPRNTIys11DYjY8++ghJSiWQFTuzHLZaoJ7k5GROPvlkkpOTIRC/Hx6BjI6QnMlHH31kdSkR88UXX+APBBiZG7//r9FuVG49W7aWsiwGeiFpKOyCz+djwVdf4csqAInfX5X46znmmGP485//zNFHH4344/jDQ4T6rEIWfvstdXV1VlcTdsFgkNdee5W8VEOvTG1PsMqIDvW47PDKK69YXcpuxe8nXRtYvnw5Pq837qfHNo4k3n33XR577DFmzZqFcSRZXVJY+bO7EvD7+emnn6wuJew+/PBDVq5cxYnda3XQmoXSnIYjCtx8/PHH/Pjjj1aXs0saCruwfVqEQHqexZWEmT0Jr9fLG2+80dgAa4/vUAik5YJI3IfCkiVLePihB+mTHWBsfhwf/cWIowo9tE+BW2+5mS1btlhdzm/SUNiFkpISxJmMcaZYXYpqS3Yn4kpnw4YNVlcSNitWrOCG668j01HPFYO36VFCFEhxwNVDqvDWVnHtNX+N2mDQUNiFsrIygklpVpehwsDvSKW0rMzqMsLiww8/ZPKllyK+aq4eUkWmToAXNbqmB7hi8Da2bizmogv/GJVTumso7EIgEMDEcQNzIjNii5lpB1qqpqaGqVOncvfdd9MjzcPtoyronKbTZEebATl+poysJC1QzdVXX81zzz2Hz+ezuqwm+om3Cw6HAzHx9cGhGokJ4nQ4rC6jTRhjmDt3Lr8/52xmz57FUYUerhu2TafIjmKd04JMGVnJ2FwPzz//PBecf17UHDWEPRRExC4ii0Tk3dDtdiIyR0RWhC5zmu17g4isFJFlInJ4uGvbnby8PGy+WoiBASeqdRwNteTlxX4HglWrVnH11Vdxxx13kBOs4LZRVZze241Dv+5FvRSH4ZJBtVwztJr6yg1ceeWV3H77bWzevNnSuiLxVekKYAmQGbp9PTDXGHOviFwfun2diAwETgcGAZ2Bj0SkrzHWfVUvLCzE+OuR+jqMK92qMlQbkwYvxldHYWGh1aXssYqKCp555hnemz2bVKfh931rOaiLTxuUY9CQ9g3cnV3BzHUpzP5kPp99+hmnnnYaZ555JqmpkZ+AM6zfJ0SkK3A0MK3Z5uOB50PXnwdOaLb9FWOMzxizBlgJjAlnfbszYsQIABzV8dtLJRHZQ/+f2/9/Y4nf7+f111/n7LPO5P3Zszi0q5up+1ZwSNf4DASPX3YYbe/xx+EPCSTZ4aSeHu7bt5KR7WqZPn0655x1JnPnzo341BgtCgUROUhE9iSyHgGuBZq3duUbYzYBhC63H8N3AYqb7VcS2maZbt26kZffEWfFaivLUG3MWbGGzKxs+vbta3UprfLdd9/xxwvO55///Ce9U2u4e99KzurjJs0Zv6c33X7ZYbS9O05DYbsOyUH+NKiWW0duIzNQzh133MFfrriC1asj
9xnU0iOFPwDficiXInK/iBzbvC1gZ0TkGGCrMWZhC19jZ//bv3q3i8hFIlIkIkWlpaUtfOo9IyIcf9yx2Ks3YfPEzsLb6reJrwbHtmKOO/YY7PbYmDI0EAgwbdo0/vKXv1C3dR1XDKnm6n2q6ZQa/z2LUh1mh9H2qY74DcDmemf5mTKyij/0q2XV0u+5+KILmTlzZkSOGloUCsaY3xtj+gIn0fgN/p/A7j6R9weOE5G1wCvAQSIyHdgiIp0AQpdbQ/uXAAXNHt8V+NVq18aYp4wxo4wxo3Jzc1tS/l45+uijcbmSSdoQHT0D1N5xbfwOu93OcccdZ3UpLbJt2zauveYapk+fzoROXu4eU8HI3AYkvr8wN0lxmB1G26ckSChA4+yqB3Xxcd+YCvpnennwwQe57777wt59taWnj84WkSeB/wCHAP8ADtjVY4wxNxhjuhpjutPYgDzPGHM28A5wbmi3c4EZoevvAKeLiEtEegB9gK9b+fO0uezsbM4443SclWux11jbK0DtHVtdGc6yFZxy8skx0fPIGMPtt93G9/9bxAX9a7lgQB2u2Di4UW0oI8lw9dBqTuju5v333+eJJ54I6+u19PTRI8Aw4GngcmPM/caYL/fwNe8FDhWRFcChodsYY34CXgMWA+8Dk63sedTcaaedRl5ePqlrP4NAg9XlqD0R9JO69lOyc9px1llnWV1Ni8yaNYuF337LWb1rmNA5egY3qcizCZzY08NhXT28/fbbfPfdd+F7rZbsZIzpAJwPJAN3icjXIvJiS1/EGPOxMeaY0PVyY8zBxpg+ocuKZvvdZYzpZYzpZ4x5r5U/S9ikpKRw0003gq+G5HVf6riFGOQq/gZxV3LjDdeTkZFhdTktMmfOHLqkBZiogaBCTunlJtkBc+fODdtrtPT0USZQCHQDugNZ7KQROJ4NHTqUc3//e5zlK3FuWWx1OaoVnKXLSNq6hFNPPZUxYyzt5dwq6enpiNjisqup2jNOGzQEG98b4dLS00efAccC/wNODX2T/33YqopS5557LuPHjye55GsclWutLke1gH1bCcnrv2TU6NFcdNFFVpfTKn379qWkVvh8c3xPZa5a7s01KQSC0KdPn7C9RktD4QKgE41jDt4Tkd5T3gwAACAASURBVO9F5PuwVRWlbDYbN954IwMGDCB19SfYt5VYXZLaBXv1JtJWzqNnjx5MufVWHDE219GZZ57JsKFDeWZpBovKnFaXoyxkDMwpSeadtakceeSRHHjggWF7rZaGwnTgWeBEGo8Ytv9LOKmpqdx/33306NGdtJXzsFcV7+4hygL26o2krfyILl0689CDD8ZMO0JzTqeT2++4g8Ju3Xn4+0yeWZIWtyN61W+r8Nl46PtMXlyexpjRo7nqqquQMPZJbmkolBpjZhpj1hhj1m3/F7aqolxGRgYPPfggvXv3JHXlXBzlq6wuSTXjqFxH6oo5FBZ04dFHHiY7O9vqkvZYZmYm/3ryKc444ww+3ZzCjd+044vNSQQTqkUvMdUH4IPiZG78OoelNalcfvnl3HvffTid4T1qbOnx9BQRmQbMBZq6Qhhj3gxLVTEgOzubRx5+mBtuuJHvv/8EX30d9R2HkDCjiqKUc8tikou/ol/ffkydej+ZmZm7f1CUS0pK4uKLL2b8+PE89OAD/GvxGmauDzKpex2jcuu1ITrONAThk40uZq5Pp9ILI4YP56qrr6Zr18isFd/SUDgP6A84+XkeIwMkbCgApKWlMXXq/dx7733Mnz8P8Vbj6zYObDrCKOJMENf6r0naupj99t+fW26+mZSU+FpGddCgQTw97Rn++9//8twz0/jHjyUUpAc5osDNuHyfTpcd4zx+4ZONLj7YkEa5B4YMHsStF/yR4cOHR7SOlobCUGPMkLBWEqNcLhe33HIzXbt24cUXX8Thq8Ld8yBMUuSnvN1TwdR2GHc5AIHU9gRT21lcUetIg4eU1R9jr97EqaeeysUXXxwz8xq1
ls1mY+LEiRxwwAHMmzePl/49naeXrOP1Nekc0tnNQV28pMfxBHnxqNxr48OSZD7elIKnAfYZMpjrf38uo0aNCmvbwW9paSgsEJGBxhjtoL8TNpuNCy64gF69enHPPfdiX/IOdT0nEsjoaHVpLeIrHIvN3TiG0NP/KIuraR1bbSlpq+fjDPq45sYbOeyww6wuKSLsdjuHHnoohxxyCEVFRbz66iv8p2ghM9elsV++h0O7eumaHhUTAqidMAZWbHMwpySZb0pdII1hf+qpp9K/f39La2tpKIwHzhWRNTS2KQhgjDH7hK2yGDRx4kQKCwu56eZb2LTsPXxdRmo7Q7gYg3PrYpJLviG3QwfuuvOhmJsKuy2ICKNHj2b06NGsXr2a119/nbkffcT8jckMbOfnsC5uhnVo0HaHKFEfgK+2upizIZW11TbS01I5+ZRjOOmkk8jPz7e6PACkJVOxiki3nW23ugfSqFGjTFFRkZUl7FRdXR33338/n3zyCf6srnh7HIBxRvf57ZSls4EYOVLw+0hZ+xmOynWMGzeOG264IS4alNtKVVUVs2bN4u233qS0rJzcVMOhnd38rrMv5qaevvvbTJZW/dzbpn92AzeOqLawoj1T5RPmbkhm/qZUqn3QrbCAk04+hUMPPdSSti8RWWiMGbXT+yK9qk9bitZQgMYZLt9++23+8c9/ErQlUdfjdwQyO1td1m+KlVCw12whdc0n2Bo8XHTRhZx22mmWnHeNBX6/n88++4w33/gP3//wI8kOYXxHN4d19dIxRtZiiPVQWF1t58PiFL4qdRE0MG7sOE46+WRGjBhh6ft2V6EQW0M8Y4iIMGnSJIYMGcKUv/2NDcvex9dpH+o7jwCbdhNpNRMkadP3uDYuIj8/n79NmcqAAQOsriqqORwOJk6cyMSJE1m+fDn/+c9/mD9vLnNLUhjRoZ4jCz30yfLr2c02FjTwv3Ins9ensqzKQWpKMidMOppJkyZFrFvp3tAjhQjweDw89thjzJ49m2B6Lu4eEzDJ0XW6I5qPFMRXS+qa/2Kr2czBBx/MVVddRVpamtVlxaTy8nJmzJjBW2++QU1tHb2zAhxZ4GZklI53mL48lU83uQDolhGgMN3P2X3dFle1c/UB+Hyzi/dL0thUJ+TlduCUU0/jqKOOirr3q54+ihIff/wx90+disdbj7twLP72vaOmETpaQ8FRsZrUdV/ictq46sorE6Z3Ubh5PB4++OADXnv1FTZu2kynNMMxhXVROd7h7m8bv0BF62kjXwDmbUjmvZI0qrzQp3dvTj/jDCZMmBC1823p6aMoMXHiRAYOHMgdd97JD99/SsO2Erzd9gOHy+rSok+ggeT1C3CWraBv//7cesstdOnSxeqq4kZKSgonnHACxx57LP/973+Z/uILPL1kDW+tTePogjp+19mHM8rCIdq4/cKc4mQ+2JBKbT0MHzaMs84+m5EjR8Z0O5eGQoTl5eXxyMMP89JLL/Hcc8/hdJdR1/13BDOioztaNLDVlZG25hPwVnPOOedw7rnnRu03rlhnt9s58MADmThxIgsWLODFF1/g+cVLeLc4neMKazmgU/QdOVjN44c5JSm8V5xKXQOMGzuWs885h0GDBlldWpvQvzQL2O12zjnnHEaMGMFtt93O1mWz8XUeTn2nfUAS+C/QGJxbfiJ5QxHtcnK49Z6HGTZsmNVVJQQRYdy4cYwdO5aFCxfyzLRpPLd0Ke+uT+OE7nXs39EXlW0OkVQfgI82JDNrfRo19Y1hcN7558fd+BgNBQsNGjSIZ599hgcffJD58+fjqNmMp+eEqB/TEBZ+LymrP8WxrZj99t+f6669lqysLKurSjgiwqhRoxg5ciRff/01z0x7mqeXrOT94lRO7VnLPu0boqUZLGKCprEB+c216ZR7YPToUZx//gVx2/tNQ8Fi6enp3HrrrYwcOZJHH30Ux+IZUT+moa1tH3tgD3iZfPnlTJo0KabPycYDEWHfffdl
zJgxfPzxxzz91JM8+P1mBuT4ObN3Ld0yEmMKjR8rnLy8Mp3iWhv9+vXl5kv+FPEJ6iJNQyEKiAjHHHMMAwcO5NYpUyhZ/gG+ziNCp5Pi+MNx++mikiI6dszntr89SL9+/ayuSjUjIhx44IGMHz+ed999l/977lluLXJwYGcvJ/V0kxGnk++Vemy8vDKNotIkOnXMZ8rVlzBx4sSE+LKSwCewo0/Pnj156sknOXDigbg2LCRl5Vzw11tdVngEGkheNZ/k4q/Zb79xPP3UUxoIUczpdDJp0iSm//slTjzxJD7elMq1C9rxyUYXMdyr/Vf8QZixJoXrv87hh+p0/vjHP/J/z7/AgQcemBCBAHqkEHVSU1O59dZbGDJkMP/4xz9xLJ1JXa+DCabE7uphvyTeatJWzcPmqeTCiy7ijDPOSJg/uFiXkZHBZZddxtFHH80jjzzMM9//wJdbkjm/fw15KbExdcZvWV1t55llmRTX2JgwYQKTJ08mLy/P6rIiTo8UopCIcOKJJ/Lwww+RmSSkL303btaCtldvJGPpTNKlnqlTp3LmmWdqIMSgnj178sgjj3LllVey1pvOTV/nMH9DbB41BILwxuoUbluYjdvRjjvvvJPbbrstIQMBNBSi2tChQ5n29FP07N6N1JUf4dz8EzH5Vxfi3LqU1OUfUNC5I08//RSjRu10QKWKETabjeOPP57/e/4FBg8dwXPL0nn8p3Q8/tgJ+QqvjXu/y2LG2lQOPfQw/u+FFxk/frzVZVlKQyHK5eXl8Y/H/s74/fcnufgrXOsXgImxw3RjcBV/Q/K6Lxg9ejRPPP44nTsnTu+qeJeXl8fUBx7gwgsv5JuyFKYszGGLO/o/WlZsc3BLUQ7rPKnceOON3HjjjWRkZFhdluWi/39OkZKSwu23387pp59O0tYlpKyaD0G/1WW1TDBA8ur/krT5B4477jjuufvuqJscTO09m83GWWedxSOPPILblsHt3+awalv0NlkWlSZx73dZZHboxFNPT9M5tZrRUIgRNpuNSy65hD//+c84qtaTtvxDCER5z6SAv/G0V8UqLrzwQq688kqdriLO7bPPPvzz8SdIb5fPvd9lsTIKg+GrLUk89kMGffr255+PP0FhYaHVJUUVDYUYc/LJJ3PLzTfjqCslbfkH4PdaXdLO+etJW/EBjuqNXHvttZx11lnaoJwgCgoK+OfjT9A+ryMP/ZDFxrro+ZhZXOHgySUZDB48iIcefoTs7Pjp1ddWoud/S7XYwQcfzJ133kGSt4r05R+A32d1STsK1JO24kMcdWXceuutHHVUdE3HrcKvXbt2TH3gQRwpmTz6Yzb1UTAAuson/P2nLAoKu3H3PfeSnJxsdUlRSUMhRu23337cc8/dOH3VpC9/P3qCIdDQGAjucm677W8ceOCBVlekLNKlSxduuvkWNtUJM9ZaP5/XC8vTacDB7XfcqQ3Ku6ChEMNGjx7NXXfdicO3jbSVcyBgceNzMEDqyrk46sr429+mcMABB1hbj7Lc6NGjOeyww5hdnEp1vXWnD1dtc1BUmsQf/nAeBQUFltURCzQUYty+++7Lrbfcgr22lJRV86zrrmoMyas/wV69keuvv57f/e531tShos5ZZ51FIEjTsppW+Hiji2SXi0mTJllWQ6zQUIgDEyZM4KqrrsKxrQTX+q8sqSGppAhn5Vr+9Kc/afc+tYNu3brRv19fFpVbFwqLKpIZf8ABpKamWlZDrAhbKIhIsoh8LSL/E5GfROS20PZ2IjJHRFaELnOaPeYGEVkpIstE5PBw1RaPjj322KZxDM4tiyP62o6yFbhC4xBOPfXUiL62ig19+vZjg9tpyYD86nqh2kfcLYYTLuE8UvABBxljhgLDgCNEZCxwPTDXGNMHmBu6jYgMBE4HBgFHAI+LiD2M9cWdCy+8kLFjx5Jc8jW22q0ReU2bu5zUdV8ybPhwLr/8cu12qnYqLy+PunqD34JQqKq3NdWgdi9soWAa
1YZuOkP/DHA88Hxo+/PACaHrxwOvGGN8xpg1wEpgTLjqi0d2u52bbrqJvA65pK3+OPw9kgINpK2eT3Z2JlNuvVUHpqndsuIrg35NaZ2wtimIiF1EvgO2AnOMMV8B+caYTQChy+3x3QVoPhVoSWjbL5/zIhEpEpGi0tLScJYfkzIyMrj99tuwNbhJDnP7gqv4G/DWMOXWW8nJydn9A1TCcrvd2G1gt+AT2mVvPDzxeDyRf/EYFNZQMMYEjDHDgK7AGBEZvIvdd/Z2+dXBpjHmKWPMKGPMqNzc3LYqNa7079+fc845B2f5yrBNuW2v3kRS6VJOPeUUhg0bFpbXUPGjoqKCbJc1CwlmJQWbalC7F5HeR8aYKuBjGtsKtohIJ4DQ5faT3yVA8w7EXYGNkagvHp1zzjkUFBSSWvxV20+eFwySUryAvPyOXHDBBW373CoulZeXk+W0Zlizyw4pTtFQaKFw9j7KFZHs0PUU4BBgKfAOcG5ot3OBGaHr7wCni4hLRHoAfYCvw1VfvHM4HFx55V/AW01SG/dGcpYuRdyVXHH5Zbhc1nUzVLGjoryMrCTr5rrITgpSXl5u2evHknC2DHYCng/1ILIBrxlj3hWRL4HXROQCYD1wCoAx5icReQ1YDPiBycaYKJgxJXaNGDGCMWPG8M2i76nP6w/2pL1/0qCflM3fM2jIEPbbb7+9fz6VEDweNx3t1i0QlWwP4vVG6eSRUSZsoWCM+R4YvpPt5cDBv/GYu4C7wlVTIjr//PP5+pJLSNq6lPpO++z18znLVmDq3Vxw/vna/VS1WCAQQCwMBZsYAgH9jtkSOqI5zvXv35+hQ4fiKl2691NgGINr6xL69O2rjcuqVTIyMnH7rfu4qfM7SE9Pt+z1Y4mGQgI48cQTwVeLvXrv2u3ttVsQTxUnnXiiHiWoVumQm0eZz5pxLEEDFV7o0KGDJa8fazQUEsC4ceNITUvDWb5qr57HUb6apCSXTnanWq1Pnz5sqLXhtWAi35I6O/WBxhrU7mkoJICkpCQmTphA0raSPT+FZAyubesZN26sTiqmWm3IkCEEDSzf5oz4ay+tbHzNwYN3NUxKbaehkCDGjh2L8fuw7+GcSDZ3OaberT2O1B4ZNmwYyS4XC0vboAdcKxWVuujerZDOnTtH/LVjkYZCghg+fDgigr160x493l6zGYCRI0e2ZVkqQbhcLvbbf3++KUumIYJLfpR7bSyrcjBhoq4A2FIaCgkiIyODbt17YK/dskePt9dsIb9jR22sU3vsyCOPpLYevo3g0cKnm1wY4IgjjojYa8Y6DYUEMnjQQJyecvZkUvskbzlD9Jys2gsjR44kPy+XeRsjs15zIAifbE5l5IgRdOrUKSKvGQ80FBJInz59MA0+pL529zs3Iw1ejLeW3r17h6kylQhsNhvHHX8CSyodbKgL/1Ip35U7KffACboEZ6toKCSQnj17AmDzVLXqcTZPJQC9evVq85pUYjn66KNxOh3MKUkO+2vNKUklt0N7xo0bF/bXiicaCgmke/fuANhDH/ItZfM2hki3bt3auiSVYLKzszn44EP4fHMydQ3hGwBZUmtncaWDSSeepIs/tZKGQgLJyMggKzsHm3dbqx5n82zDlZyMrl+h2sJJJ52ELwCfbArfDLsfFCfjSnJy9NFHh+014pWGQoLp3q0b9taGgreKwoICndpCtYk+ffowdJ99mLMhjUAYuqfW1AtfbE3msMOPICsrq+1fIM5pKCSYwsIC7L5treqB5Kyv1lNHqk2dcuqplHugKAzdU+duSKYh0HhEolpPQyHBdOvWrbEHkr+Fc8sHGjDeWgoLC8NbmEoo48aNo0vnTrxXnLonPaR/U30APtqYyr5jxjS1oanW0VBIMNv/UGwtbGze3lOpR48e4SpJJSC73c6pp53O6mo7y7e1XUPwF1tcVPvg1NNOa7PnTDQaCglm+4e7zd2yUNjeU0lDQbW1ww8/nKzMDGavb5vBbEED7xen
0ad3L0aMGNEmz5mINBQSTPv27cnMysLuadki5jZ3Ba7kZJ1MTLW55ORkTph0IovKkthUt/cfRf8rd7KxTjj9jDO1U8Re0FBIQP369sXh3nER82BqO4Kp7X61r91dRu9evbHZ9K2i2t4JJ5yA0+ngg5K9P1p4v7hxsNqECRPaoLLEpX/pCah///6IpxICDU3bfIVj8RWO3XHHYBCHp4IBA/pHuEKVKHJyctpkMFtJrZ0lOlitTWgoJKCBAweCMdjryna5n81TgQn4GTRoUIQqU4noxBNPxBeAzzfv+WC2eRtdOJ0OjjrqqDasLDFpKCSg7R/yu5tG216zZYf9lQqHvn370rtXTz7fsmfzITUEYcHWFMaPP4Ds7Ow2ri7xaCgkoMzMTLp17960cM5vsddsJi+/I3l5eRGqTCWqw484kjXVdja5W/+R9GOFk9r6xt5Mau9pKCSoEcOH46zbCsHAzncwhqS6LYwYPiyyhamEdMABBwDwXVnrRzgvKksiNSVZVwVsIxoKCWr48OGYgB97XelO77e5KzANXu3vrSKiY8eO9OjejR/KWx8KP1QmM3LUaJxOZxgqSzwaCgnq5zWbN+70/u3bhw8fHsmyVAIbOmw4K2uSWjVJXpnXRrkHhg3TI9q2oqGQoDIyMujTty/O3wgFR81GuhYU6nTZKmIGDhyI12/Y5G75qmxrqh1Nj1VtQ0MhgY0eNQpbXdkO4xUACAZw1m5hzOhR1hSmEtL2lf3W17Y8FIpr7dhEdBqWNqShkMBGjBgBJvirXkj22q2YgF/bE1REFRYWYrPZ2NiK9Zs31Nnp1DGf5OTwL++ZKDQUEtjgwYNxOJw4fnEKyV6zCRFh6NChFlWmEpHT6aRTx/xWnT7a7HVS2F2PEtqShkICc7lcDBo0EGftjkcKjprN9O7Th4yMDIsqU4mqsFt3Nnl27EVUmO6nMN3/q32DBjbX2SgoKIhUeQlBQyHBDRs2DHFXQKC+cUMwgL2ulOHam0NZoLCwkC1uG8FmC++c3dfN2X3dv9q33GujIYguANXGNBQS3JAhQxrnQaptHK9gqyuDYIDBgwdbXJlKRN27d6chCFs9u/9oKgm1PegKa20rbKEgIgUiMl9ElojITyJyRWh7OxGZIyIrQpc5zR5zg4isFJFlIqJj1iOgf//+jeMVQoPYtk+Sp/MdKSts70VUXLv7mU6376Oh0LbCeaTgB642xgwAxgKTRWQgcD0w1xjTB5gbuk3ovtOBQcARwOMi0vIWJ7VH0tPT6dKla7NQKKVd+w60b9/e4spUIurRowd2u421Nbv/019bY6dzp46kp6dHoLLEEbZQMMZsMsZ8G7peAywBugDHA8+HdnseOCF0/XjgFWOMzxizBlgJjAlXfepn/fr1xeltXHbT6a2kf7++FlekEpXL5aJHjx6sqt71lBXGwOpaF/0H6KC1thaRNgUR6Q4MB74C8o0xm6AxOIDtU3B2AYqbPawktO2Xz3WRiBSJSFFp6c7n7VGt06tXL4y3FmnwgKeqaRCRUlbYZ5+hrKp24t/FdBdlXhsVnlCbmGpTYQ8FEUkH3gD+Yoyp3tWuO9lmfrXBmKeMMaOMMaN0Coa2sf2crKOqGIzRc7TKUkOHDsUXgDU1v92usLjS2bSvalthDQURcdIYCP82xrwZ2rxFRDqF7u8EbA1tLwGadzjuCux8Yh7Vprp27QqAvap4h9tKWWH48OHYRPih/LdPIf1Q4aR9TrZObxEG4ex9JMAzwBJjzEPN7noHODd0/VxgRrPtp4uIS0R6AH2Ar8NVn/pZfn4+AI6aTQB07tzZynJUgsvMzKT/gP78r2Lny3P6g/BjpYsxY8fR+DGj2lI4jxT2B84BDhKR70L/jgLuBQ4VkRXAoaHbGGN+Al4DFgPvA5ONMb+xAoxqSy6Xi6zsHCRQT3Jyio5kVpYbP/4A1lTbKff++iNqWZUDdwPsv//+FlQW/8LZ++gzY4wYY/YxxgwL/ZttjCk3xhxs
jOkTuqxo9pi7jDG9jDH9jDHvhas29Wt5uR0AaN+hg8WVKAXjx48HYGHprxfdWVjqwpXkZNQoncU3HHREswLguOOOY9iwYUw64XirS1GKwsJCenTvxjelO55CChooKktm37HjdGbUMNn9sEGVEI499liOPfZYq8tQqsmEiQfy/P+to8onZLsaOyKu2OagygcTJkywuLr4pUcKSqmoNGHCBAxQ1OwU0jdbk3A6HYwbN866wuKchoJSKip1796dgq5d+Las8RSSMfBteQqjR48mNTXV4uril4aCUioqiQj7jz+AJVVOPH6huM5OmQf233+81aXFNQ0FpVTU2nfffQkEYWmVgx8rnE3bVPhoQ7NSKmoNHDiQJKeTxZVONrntFBZ0pYN2mw4rPVJQSkUtl8tF//79WVntZFWNiyH76FxH4aahoJSKav3692fVNgd19YZ+/fpZXU7c01BQSkW15rP26gy+4aehoJSKas1n7dUZfMNPQ0EpFdWar5uSnZ1tYSWJQUNBKRXVcnJymq7bbPqRFW76G1ZKRTWd+C6ydJyCUiqqiQjpaWkUFBbsfme11zQUlFJR7+VXXsHh0I+rSNDfslIq6ulqgJGjbQpKKaWaaCgopZRqoqGglFKqiYaCUkqpJhoKSimlmmgoKKWUaqKhoJRSqokYY6yuYY+JSCmwzuo64kgHoMzqIpTaCX1vtq1uxpjcnd0R06Gg2paIFBljRlldh1K/pO/NyNHTR0oppZpoKCillGqioaCae8rqApT6DfrejBBtU1BKKdVEjxSUUko10VBQSinVREMhTonIJBExItJ/N/v9QUQ6N7s9TUQGhr9ClYhEJCAi34nITyLyPxG5SkT0cyiK6H9G/DoD+Aw4fTf7/QFoCgVjzB+NMYvDWJdKbB5jzDBjzCDgUOAoYIrFNalmNBTikIikA/sDF9AsFETkWhH5IfQN7V4RORkYBfw79O0tRUQ+FpFRIvInEbm/2WP/ICKPha6fLSJfhx7zpIjYI/wjqjhgjNkKXAT8WRrZRWSqiHwjIt+LyMXb9/3leze0bZiILAjt+5aI5IhILxH5ttnj+ojIwsj/dLFLQyE+nQC8b4xZDlSIyAgROTK0fV9jzFDgfmPMf4Ai4KzQtzdPs+f4D3Bis9unAa+KyIDQ9f2NMcOAAHBWBH4mFYeMMatp/BzKo/FLzDZjzGhgNHChiPTY2Xs39PAXgOuMMfsAPwBTjDGrgG0iMiy0z3nA/0XsB4oDukZzfDoDeCR0/ZXQbRvwnDHGDWCMqdjVExhjSkVktYiMBVYA/YDPgcnASOAbEQFIAbaG44dQCUNCl4cB+4SOYAGygD7AIfzivSsiWUC2MeaT0L7PA6+Hrk8DzhORq2j8AjMmAj9D3NBQiDMi0h44CBgsIgawAwZ4I3TZGq8CpwJLgbeMMUYak+B5Y8wNbVi2SlAi0pPGo82tNIbDZcaYD36xzxG07r37Bo3tFPOAhcaY8jYqNyHo6aP4czLwgjGmmzGmuzGmAFgDVADni0gqgIi0C+1fA2T8xnO9SeNh+xk0BgTAXOBkEcnb/jwi0i08P4qKZyKSC/wL+IdpHEX7AfAnEXGG7u8rImnAh/zivWuM2QZUisgBoac7B/gEwBjjDT3XE8BzkfyZ4oEeKcSfM4B7f7HtDWAA8A5QJCL1wGzgRhrPt/5LRDzAuOYPMsZUishiYKAx5uvQtsUicjPwYagrYQONp5R0CnPVEiki8h3gBPzAi8BDofumAd2Bb0NHpKXACcaY90NtBL98755L43s3FVhNY/vBdv+msU3sw/D/SPFFp7lQSsUdEfkrkGWMucXqWmKNHikopeKKiLwF9KKxbU21kh4pKKWUaqINzUoppZpoKCillGqioaCUUqqJhoJSrSAitbu5v7uI/NjK5/y/ZqN4lbKUhoJSSqkmGgpK7QERSReRuSLybWj2zuOb3e0QkedDs3f+p9lI3JEi8omILBSRD0Skk0XlK/WbNBSU2jNe
YJIxZgRwIPBgaBQuNE4e+FRo9s5q4NLQ1A2PAScbY0YCzwJ3WVC3Urukg9eU2jMC3C0ivwOCQBcgMegaKgAAAM5JREFUP3RfsTHm89D16cDlwPvAYGBOKDvswKaIVqxUC2goKLVnzgJygZHGmAYRWQskh+775YhQQ2OI/GSMGYdSUUxPHym1Z7KAraFAOBBoPlNsoYhs//DfvizqMiB3+3YRcYrIoIhWrFQLaCgotWf+DYwSkSIajxqWNrtvCXCuiHwPtAOeMMbU0zit+X0i8j/gO2C/CNes1G7p3EdKKaWa6JGCUkqpJhoKSimlmmgoKKWUaqKhoJRSqomGglJKqSYaCkoppZpoKCillGry/7j/gqZN8VxlAAAAAElFTkSuQmCC\n",
      "text/plain": [
       "<Figure size 432x288 with 1 Axes>"
      ]
     },
     "metadata": {
      "needs_background": "light"
     },
     "output_type": "display_data"
    }
   ],
   "source": [
    "sns.violinplot(x=tmp_df[\"label\"], y=tmp_df[\"mw\"])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "An examination of the distributions in the figures above shows that the molecular weight distributions for the two sets\n",
    "are roughly equivalent.  The decoy set has more low molecular weight molecules, but the center of the distribution, shown as a box in the middle of each violin plot, is in a similar location in both plots. "
   ]
  },
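  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The violin plots give a visual impression, and a numeric summary can back it up.  The cell below is a minimal sketch that assumes a dataframe shaped like tmp_df, with a label column and an mw column.  The name toy_df, the label strings, and the values in it are hypothetical and are only there to keep the example self-contained."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import pandas as pd\n",
    "\n",
    "# Hypothetical toy data standing in for tmp_df (not the real ERK2 set)\n",
    "toy_df = pd.DataFrame({\n",
    "    \"label\": [\"active\", \"active\", \"active\", \"decoy\", \"decoy\", \"decoy\"],\n",
    "    \"mw\": [410.5, 395.2, 430.1, 350.0, 402.3, 415.8],\n",
    "})\n",
    "\n",
    "# Quartiles per class; these correspond to the box drawn inside each violin\n",
    "summary = toy_df.groupby(\"label\")[\"mw\"].describe()[[\"25%\", \"50%\", \"75%\"]]\n",
    "print(summary)"
   ]
  },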
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We can use violin plots to perform a similar comparison of the LogP distributions.  Again, we can see that the \n",
    "distributions are similar, with a few more decoys at the lower end of the distribution. "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 23,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "<matplotlib.axes._subplots.AxesSubplot at 0x7f6a9662ab38>"
      ]
     },
     "execution_count": 23,
     "metadata": {},
     "output_type": "execute_result"
    },
    {
     "data": {
      "image/png": "iVBORw0KGgoAAAANSUhEUgAAAYAAAAEGCAYAAABsLkJ6AAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADh0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uMy4xLjEsIGh0dHA6Ly9tYXRwbG90bGliLm9yZy8QZhcZAAAgAElEQVR4nO3deXyU5b3//9dnlkz2DYhhSdgX2YUACfuuLEopqFWxtraHfqu1vyP21K+1nvNrtW7t6aI9p4hUaytuRS2b4sa+yCogyioIYU8g+zLr9f1jEgQEgSSTezLzeT4ePEwmM/d8AuP9vq/rvhYxxqCUUir62KwuQCmllDU0AJRSKkppACilVJTSAFBKqSilAaCUUlHKYXUBV6N58+amXbt2VpehlFJNypYtWwqNMS0ufLxJBUC7du3YvHmz1WUopVSTIiKHLva4dgEppVSU0gBQSqkopQGglFJRSgNAKaWilAaAUkpFKQ0ApZSKUpYGgIjcLyKfichOEXlVRGKtrEcppaKJZQEgIq2BnwI5xpiegB34jlX1KKXChy5T3zis7gJyAHEi4gDigWMW16OUstiRI0eYOnUqO3futLqUiGdZABhjjgK/Aw4Dx4ESY8z7Fz5PRGaKyGYR2VxQUNDYZSqlGtmWLVsoLi7mvffes7qUiGdlF1AaMAVoD7QCEkRkxoXPM8bMMcbkGGNyWrT42lIWSiml6sjKLqCxwEFjTIExxgu8BQy2sB6llIoqVgbAYSBXROJFRIAxwC4L61FKhYHg6UBvBDcGK+8BbADmA1uBT2tqmWNVPUqp8FIbBCp0LF0O2hjzX8B/WVmDUio8aQsg9KweBqqUUuepPfFrCyD0NACUUmHF6/VaXULU0ABQSoUVt9ttdQlRQwNAKRVWagMgEAhYXEnk0wBQSoWV2i4gn89ncSWRTwNAKRVW/H7/ef9VoWPpMFBlndrmtYjoaAsVVjQAGo+2AKLQG2+8wejRoxk9ejT/fv8sq8tR6jy1XT86Gij0NACijDGGBQsXEohLxZvWlu3bPiE/P9/qspQ6q7q6GtDRQI1BAyDKbN26laNHjuC+pifu7DwQGwsWLLC6LKXOqqqqAqCyssLiSiKfBkAUCQQCzHn+eYiJx9esAyYmHm96BxYsWMCJEyesLk8pAD777DMAis+csbiSyKcBEEXmz5/Pnt27qWrdH2zB+//u1v3wBQxPPfW0jrtWYaG0tBSAwtOn9TMZYhoAUWLDhg3Mnv0cvrRsfM06nX3cuBKpysrlk0+28pe//EUX4FKWqqysxOv14hCD2+PVlmmIaQBEgQ0bNvDII/+JPy6VqnbD4YJhn97mnfFkdOef//wnzz//vIaAssyBAwcASIkJXvnv3bvXynIingZABDPG8NZbb/HQQw/hdiZS0Xk8OGK+/kQR3NmD8LToyiuvvMKvf/1rKisrG79gFfW2bt0KQJorgMshbNu2zeKKIpsGQIQ6ffo0Dz/8MM888wye5NaUd52IccZd+gUiuNsOxt0mh+UrVnD3D37Ap59+2ngFKwV8vH49sXaDwwbXprj5eP06bZGGkAZAhPH5fLz11lvMmHEn6z/eQHXWIKo6jQW78/IvFsHTsjeVXSdw4kwZ9913H08//TTFxcWhL1xFvfz8fD7ftYskZ7D7Z2CGmxMnT+mFSAhpAESIQCDAhx9+yIw77+SZZ56hPCaNsh5T8Wb2+Fqf/+X4kzIp6zEVT2ZP3nn3XW79znd48cUXKS8vD1H1SsHixYsR+ar/f0CGhzgHLFy40OLKIpelASAiqSIyX0R2i8guEcmzsp6myO12s3jxYmbceSePPfYYx0vcVHYeS2Xn8ZjY5Lof2O7EnTWQih5TKY/L5KWXXuLmm2/hueeeo7CwsOF+AaWA4uJiFvzrXwzKcOOoOSu57DCiZRXLli3jyJEj1hYYoaxuAfwJWGqM6Qb0AXZZXE+Tcfz4
cebOncvNt9zK7373O44Wu6nqOJLy7lPwp2Zf9VX/pQTiUqnuNJqK7jdRGnsNr772GrfeeiuPPvooO3bs0P5Z1SDmzZuH2+1mSruq8x6fmF2FXQK88MJfLaosslm2GqiIJAPDge8BGGM8gMeqepoCt9vN+vXreffdd9m4cSMG8KW0wdM1F39SywY76V9MIKE51Z1G4a4uJebkZyxbuZqPPvqItu3aMXnSJMaOHUtaWlrI3l9Frn379vHmm28yolU1rRPOXwE01WWYlFXJv5YtZ8KEiQwYMMCiKiOTWHUFJyJ9gTnA5wSv/rcA/58xpuKC580EZgJkZ2f3P3ToUGOXaqlAIMDOnTv56KOP+PDDj6ioKAdXAu70TnhbdMW4Eq0pzO/FeeYAMQV7sFUUYrPZGDhwINdffz25ubnExX3DiCOlarjdbu6958cUHDnAkwPPkOA0PL412HX5i37BGcEeP/xyczomIYO5f32BpKQkK0tukkRkizEm52uPWxgAOcDHwBBjzAYR+RNQaox55FKvycnJMZs3b260Gq1ijGH37t0sX76cjz5axunThYjNgSc1G2/zzviTW4JY3Xv3FVtVEY7C/cQWHcC4K3C5YhkyZDCjRo1i4MCBuFwuq0tUYeq3v/0tS5Ys4f7epVzXPLj884UBALC/xMFvPklh4MBcfvP449hs4fP5bwouFQBWbghzBDhijNlQ8/184P9aWI+l/H4/O3fuZNWqVaxYsZLTpwvBZsOX3BpvhxH4UrOvbCjnFXId/hgAd3ZuvY8ViEvDkzUAT5v+2MtO4jlzgOWr17Fs2TJcrljy8nIZPnw4ubm5xMfH1/v9VGR48803WbJkCTe2rTx78r+UTik+butYwcsff8zcuXOZOXNmI1UZ2SwLAGPMCRHJF5Guxpg9wBiC3UFRo7q6mk2bNrF27VrWrl1HWVkp2Ox4k1vjaz8seNJ3hObq2VYZgpUWxYY/uSX+5Ja4A3nYy47jKfqSles2sGLFCuwOB/379WPo0KEMHjyY5s2bN3wNqkl47733ePbZZ+nf3MO321dd/gXAuDbVHK2w88orr5CUlMRtt90W4iojn9VbQt4HzBORGOAA8H2L6wm5wsJC1q9fz9q169iyZTNerxdxuPAkt8bXsR++lKwGvdK3jM2GP6U1/pTWuE0e9vJTOIoOsXHHbjZu3Mjvf/97unTtytAhQ8jLy6NTp066NWWUWLx4Mb///X/TI93LPT3LsF9hb44I3NW1gkqf8Nxzz+H3+7njjjv0c1MPlgaAMWYb8LV+qUhijGHfvn2sW7eOtWvXsm/fvuAPYpPwpHXCl9YWf2ImRHKfptjwJ2XiT8rEbQZiqyrGUXyI3Ufz2fvCC7zwwgs0a96CoUMGM3jwYPr27av3DSKQMYa//vWvvPzyy/RK93Jfz1KcV/mxtwn8qHs5IjB37lyOHz/O/fffj8Nh9bVs06R/ayFQXV3N1q1bWb9+PWvWrqWoZmOLQGIG3tb98aVmEYhLC+mwzbAlQiA+DU98Gp5WfRFvJfbiI5woPszCxUtYsGABMTEuBgzIIS8vj7y8PJo1a2Z11aqeioqK+O3TT7Nu/XpGtqrmu10qzk74uloOG/yf7uW0iPWzaMkS8g8f4hcP/5LMzMyGLToKaAA0kNOnT5/t2tm8ZTNejwexO4NdO+174E9p882LsUUp44zH16ILvhZdqA74sJedwFN8mLWbt7N27VoAunbtxpAhgxkyZAgdOnTQJn8Ts379ep568gkqykqZ0bmCcW2q633tYxO4uWMVrRL8/H3XTu7+/ve4f9YDjB07Vj8fV8GyYaB1EW7DQA8fPsyaNWtYvWYNu3ftCs6KjU3Ck5yFLzULf1Im2OxWl3lRcbvfAaCq20SLK7kEY7BVncFRnI+zJB9beQEAGddkMmL4MIYMGULPnj216R/GTp06xezZs1m2bBlZSQF+fG0pbRL9l33dxYaBfpOCKhuzdyWxr9jBoEEDuffen5Cd
nV2v2iNN2M0DqAurA8AYw8GDB1m5ciXLV6zgcM2ktEBCc7yp2fhSs5tM107YB8AFxFuJozgfR9FhnGXHMAE/SckpjBwxnOHDh3PddddpGISJ6upqXn/9dV6Z9zIBv5dJWZXc2K7qivv7rzYAAAIG3suP5V+HEvAEbEybNp3vfve7JCZaNFEyzGgA1EN+fj4fffQRH3z4IUePHAER/ImZeNPa4kvNtm42bj00tQA4j9+Lo+QIjqJDxJTkY/xeEhKTGDVyBGPGjKFPnz46UcgCbrebRYsW8eor8zh9poiBGW6+07GS5nFXt69vXQKgVolHmP9FPKuOx5KYmMAtt36HqVOnRn0QaABcpbKyMj744APeXbqUfTXb0vmTW+JNa48vrW2T789v0gFwroAPe8kxnEUHiSk+jPF7SW/WjPHjxjFx4kTtCmgEVVVVLFy4kNdefYWi4hK6pfr4dvsKuqX56nS8+gRArS/L7Lx1MJ5thTEkJsQz/eZbmDZtWtQuI6EBcAWMMXz66acsWLCAlStX4fN5MQnNcKd3xJfeARMTGbNYXYc/xlkYHI7qj29GID69QWYEW87vw1F8GOeZL3CUHAUToEfPnky56SZGjRqF0xkB8yvCyKlTp3j77bdZvGghZeUVdE/z8q12lXU+8ddqiACodbDUzsIv49lSGENcrIsbJkxk2rRptGnTpt7Hbko0AL6Bz+dj5cqVvPb66+zbuxdxuHCnd8DboguB+Mgbghi3+x0cZSfOfu9Lymz6LYELiLcSZ+F+XKf3QVUJqWnpTPv2VKZMmUJycj32SYhyxhh27tzJ/PnzWb16NcYE6N/czYTsajqn1O/EX6shA6DW4TI77+bH8fEpF/4ADBo0kOnTbyYnJycqRg1pAFyEMYZVq1bx/Ny5HMnPh7gUqjN64G3WCeyRe0MxGgLgLGOwlx7DdXIn9pKjxMXHc8fttzNt2jRdsfQqlJaW8uGHH7J40UIOHPySeGdws5axratpcZV9/JcTigCoVewWlh2NZdnxeErd0LpVSyZNvpHrr78+ouebaABc4NixYzz11FNs374d4lKpatUPX1rbJjGCp76iKgDOYas8g+voVhzFh0lv1owHf/5zBg0aZHVZYcsYw7Zt21i8eDGrVq3E6/XRPtnPiJZVDMl04wrRCOdQBkAtbwA2nophxbE49hQ7sNlsDB6cx6RJkxk4cCB2e3gO366rcFwN1DKrV6/mscd+g8cfoLrtYLwtuoTV8soqNALx6VR1Hou97CTm8DoefPBBpk+fzj333KOjhs6Rn5/P+++/zwfvv8eJk6eCV/vXVDGipZu2SZcfx98UOG0wJNPDkEwPxytsrDoey+pNa1mzZi3N0lIZM248119/PR07drS61JCKugBYvnw5jz76KL745lR2Hdkkh3Cq+vEnXUP5tTfiyt/E/PnzKS8v58EHH4yKvuBLKSkpYfny5by3dCm7du9GBHqkeZnSvZoBLTzERNYF8XlaJgS4tVMl0zpU8klhDGtPeHjzn2/wxhtv0KF9O8ZffwNjx46NyNVroyoAioqK+O3vfoc3vjmVXa6PjFU3Vd3YHLizczF2J0uXLmXw4MEMHz7c6qoaVe0Wo++//z4bN2zA5/eTlRjgO52qyLvGTZqr6XQPNwSHDQZkeBiQ4aHMI2w4FcPak18we/Zs5jz3HP369WPc+PEMGzYsYva1iKoAWL58OZUVFVT3GK8nfwUieFr3I6boSxYtWhQVARAIBNi+fTsffPABK5Yvp7KqirRYGNcq2K+fHSFdPPWVFGMY28bN2DZujlfYWHvSxfpdm3liyxZ+/9+/Y8jQYVx//fX079+/Sc9Ab7qV18GZ2lU5Y6NzMoi6CLHhj0mioKDQ6kpC6tSpUyxdupR3lizmxMlTxDqEnOZVDO7qoXuaF1v09n5dVsuEANM7VDGtfRX7ShysO+liw+plLFu2jPS0VG6YMJGJEyc2ybkFURUAvXv3BsBZuB9vRjeLq7GQ
30NsbCyTJ09m8eLFlPs9VldkGakuxVl+kuvG3Gh1KQ3O5/Oxbt06lixezKZNmwgYQ490L1O7V9OvhSdko3gilQh0SfXRJdXHHZ0r2HHayarjHl579RVeeeUV+vbpw6TJkxk+fHiT2c8iqgKgf//+9L3uOrZv30AgJgF/apbVJVlCfB4m3zSZn/zkJxhjeGPRe1aXZAlxl5Ow/0Pi4+O45ZZbrC6nwVRXV7NkyRJee/UVCgpPkxYLk9tWMqKlu8HH7Ecrpw36t/DSv4WXIrew5riLlfu28ZvfbOd/nn2Gm2/9DlOmTAn7NYgsnwcgInZgM3DUGDP5m57bEPMAiouLmfXAAxz44gvcLfvgadUHbFGVg8R99i8SA5VMmjSJJUuWUG6Lp6rHt6wuq/EYg6P4MPGH1hLrtPHE44/Tt29fq6uqt4qKChYsWMAbr79GcUkpXVN9TMyupE+zptPF8/LeeFYfD149t03yk53oY0aXSourujIBA7uLHLyTH8+O004S4uOY+u1pTJ8+ndTUVEtrC9uJYCIyi+C2kMmNEQAQvEL6wx/+wHvvvQexyVS1ycGXGh2TwCB6J4IB2KqKiM3fhL3kCO3ad+CxR3/dJPtuL7RhwwaeevIJzhQV0zPdy01t678mjxUe35rM7uKvBmh0S/WGdEJYqBwstbPoUBybC1wkxMdx/6wHGDNmjGVDjcNyIpiItAEmAb8BZjXW+8bGxvLQQw8xbtw4/vSnZ8jfvwwTn051Zm986e10UlgEslUU4jq+HUfRIeLi4rn73nuZOnVqkx7BAVBZWclf/vIXFi1aRFZigPv6l9GxgdbkUXXXPtnPT3uVc6S8ihf3+HnsscdYs2YNs2bNCqu1qKz+9P8R+DlwyWE5IjITmAk0+NK+OTk5vPjiCyxbtox//ONl8g+sgKOJuJt3wduiC8YZGWN9o1bAh+PMl7gKdmMrP0V8QgLTv/tdpk2bRkpKitXV1Zvb7WbW/f/Onj17mZhdxbQOlVe9yboKrTaJfh7uV8ySQ7G8tXIF+/buYfZzc8JmWWrLAkBEJgOnjDFbRGTkpZ5njJkDzIFgF1BD1+FwOBg/fjxjx45l3bp1vPX222zdsgXXsW14U9rga94JX0pW2G7tqC5gDLaKQpyn9+MqOojxVtOqdWum3nUvEydOJCEhweoKG4Qxhj/84Q/s3rOX+3qWMSAjekdyhTubwI3tgqulPrUNHnvsUR5//ImwWG/IyhbAEOAmEZkIxALJIvKyMWaGFcXYbDaGDh3K0KFDyc/PZ9GiRbz3/vuU7F+GOGNxp7XDl94Bf+I1UXOvoCmR6lKcZw4QU3QQqSzC4XAybNhQJk2aRP/+/SNumYdNmzaxdOlSvtWuUk/+TUS3NB93dinnbxs28v777zNhwgSrS7IuAIwxDwEPAdS0AH5m1cn/QllZWdxzzz3MnDmTzZs3s3TpUtauW4f31G5wJeBJbYc3vT2BhBYaBhYSdxmOM18SU3QQW0VwIlePHj254Ya7GTlyZNg0s0Nhy5YtOGwwuW2V1aWoqzCqlZt/fZnIli1bojsAmgKHw0Fubi65ublUVlayfv16PvroIzZs2EDMyc+CYZDSFl96O/yJGXrzuBFIdQnOoi9xFh06e9Lv1KkzY8dOZ/To0WRkZFhcYePYv38/rRL8Eb1IWyQSgexED/v27rG6FCBMAsAYswJYYXEZ3yg+Pp4xY8YwZswYysvLWbduHStXrmLDhg34Tn2OxMThTsnGl9YWf1JLvWfQUIzBVlWEo+hLYooPI5XB5Ty6duvGqJHTGT58OK1atbK4yMbXrl07dnyyBbcfndHbhAQMHC6PoW+v8FhmOiwCoKlJTExk/PjxjB8/nsrKSjZs2MCqVatYt3497oI9iMOFJ6UNvrS2+JLbRPTuYiFhDLaKApxFh4gpPgTVpYgIvXr1ZsSIOxg6dCjXXHON1VVaKi8vjzfffJOVx2IZn1VtdTnqCm0piKHYDYMHD7a6FEAD
oN7i4+MZNWoUo0aNwu12s2XLFlavXs2q1Wuo2P8FYnfgSa4Jg9QssMdYXTKB+HRM5Wngq03hLWcC2MtO4ij6ElfJYYy7ArvdTv/+/Rk+fDhDhgwhLS3N6irDRr9+/Rg0cACvbt5Ex2Sfjv1vAo5X2pi7J4muXTozcuRIq8sBwmAm8NUI1abwoeDz+dixYwerVq1ixcqVFBcVITY7nuTW+NLbWx4GcbvfAbB2BnDtSf/MgeBJ31OF0+lk0KBcRowYTl5eXtivpWKl0tJSZv7bD6ksPsW/9yyhUwSEQKTMBL7Q0Qo7v/80Bbc9iefn/rXRW7BhORM4kjkcDvr160e/fv346U9/ys6dO1m5ciXLV6zgzIGVNWHQBl+zDsEwiJb1iIzBVn4K55kDuIoPYTyVxLhcDB6cx8iRIxk4cGDEbLYRasnJyTz929/x4M//gyc+gX+7tozca5r2kNAqn5y3Um2Vr+mH2o7TTv7n82RiE1J46oknw6r7MkrOOtay2Wz07t2b3r17c++99/LZZ5+xYsUKPlq2nOIvliOOGDypbfE26xi8gRyBQ0ulugRnYXByFtWlOJ1OBg8ezKhRo8jNzSU2NtbqEpuk7Oxs/jL7OR755cP8787P2FNczXc6VTTZG8OVPmHy5K9Wql215A2rS6ozXwD+dTCORYfjad+2HY8/+SSZmZlWl3UeDYBGZrPZ6NWrF7169eKee+5h27ZtfPjhhyxfsYLqwn0Qm4Q7vRPe5p2b/n7Ffi+OMwdxnd6HrewkIkK//v0ZP25cRG2rZ7XU1FT++/d/4Pnnn2f+/H+ys8jFzGtL6dwEu4TiHYbFixdjjGHJkiVc42g6XdTnOlJu57ldyRwqs3HDDTfw05/+NCw/73oPIEy43W7WrFnDknfe4ZOtWzHG4EvNwtOiG/6UNg3eKgjlPQBbVRHOU7txnfkC4/PQJiuLyZMmMW7cOJo1a9bg76e+8sknn/DE47+hoLCQ8W2qmNa+ktgmdJnX1O8BeAOw5FAcCw/Fk5iYxAP/8XOGDRtmdVl6DyDcuVyus/MMTpw4wTvvvMOChQsp2fcBxKVQndEdb/PO4XuvwBjspcdwnfgUe+kx7A4Ho0eNYsqUKfTo0SPilmIIV9dddx0vvPg35syZw8KFC9lcGMddnUvp29xrdWkRb0+xgxf3JHOsQhg9ejT33Xdf2I9c0xZAGPN6vaxatYo33vgne/bsRmLiqM7ojieje703tW+wFoAxOIoPEXt8O1JxmtS0dG6ePo1JkyZZvglGtPv000/53W+f5tDhfAa0cHNHl0rSXeG9I1hTbAGUeYU39sez8ngsmddkcP+sBxg0aJDVZZ1HWwBNkNPpZMyYMYwePZodO3Ywb948Nm7cSOypz6nK7BPc19jC5SfspceJPbIJW0UhrVq35s6fPMiYMWOIibF+roOCXr16MfevL/Daa6/xj7//nU83uJjWvoKxraux66ol9WYMrDnh4rUvEqnw2bj11pv53ve+R1xcnNWlXTENgCZAROjTpw99+vRh586dPP/882zf/jGuwr1UZQ3Cn9yycetxl+PK34iz6EtaZGRw908eZNy4cU1+c5VI5HQ6ufPOOxkzZgx/+uMfmLdxE6tPxHFn5zK6pja9m8Th4nCZnZf2JbKv2EHPHt2Z9cDP6NChg9VlXTX9P7aJ6dmzJ3/84x9Zs2YNzz77Z07teRdPRnfcbXJCv+SEMThOf0F8/sc4bHDn3Xdz66234nK5Qvu+qt5atWrFk089zapVq/jzs8/wm602hmS6ubVjBamuptMNbLUKr/DmwTg+OhpHclIS//Ef/4cJEyZgszXNJpUGQBMkIgwbNowBAwbw/PPP8+abb+IsP0FFpzEYV4iWQA74iT20DmfhPrr37MnDv/hFVC7C1pSJCCNGjGDgwIHMmzeP1197la2FsUzKruCGrCpdWfQb+AKw7Ggs/zqUQKVX+Na3vsXdd9/d5Jcc1wBowmJjY7nvvvsYOHAg
v/r1r7HtWkR5l+sJxDfwUEu/l4R9H2ArO8Fdd93Fd7/73bDYzUjVTVxcHD/84Q+54YYbeG72bOavWcOyY/Hc3L6MvEwPNh2wdZYxsLXQyesHkjhRIVx3XV/uuedeOnfubHVpDUIDIAIMGjSI52bP5v5Zs2Dve5R3nUAgroGGnwV8JOx7H0dFAQ8/8ghjxoxpmOMqy7Vp04ZHH3uM7du387//82ee27WPd44EmNaunOuaeyNxQvpV+bzIwfwDiewvsdM2qw1P/PJecnNzI2pIc9PsuFJfk5WVxTN/+hPJCbEkfLEM/A2zJozr0MfYyk7yy1/+Uk/+EapPnz7B5SQeeYRAUmv++Gkyv96ays4zTprQKPEG80WJgye3pfDkJykUO1rwwAMP8NcX/0ZeXl5EnfxBAyCitGrVikd//WukuhTXka31Pp695CgxhXu54447GD16dANUqMKVzWZjzJgxvPT3f/Czn/2MUmcGT29L5sltKewriY6OgsNldv6wI4lfbUnhmC+Ne++9l3mvvMqNN94YsSPcLPutRCQL+DuQCQSAOcaYP1lVT6To3bs3N954IwsXLcKT2aPuN4WNIe7oZlq2asVdd93VsEWqsOVwOJg8eTLjxo1j8eLFvPyPv/PoFge9m3mZ3qGCdkl+q0tscMcrbLx5MJ6Np1wkxMfxgx/czrRp08Jy7Z6GZmULwAc8YIy5FsgF7hWR7hbWEzFmzJiBTQRnwd46H8NWUYBUnOaO22/XiV1RyOVyMW3aNF559TVmzpzJQXcq/7kplTmfJ1DsjoxukAqv8PLeeB7amMaOkmRmzJjBa6+/wZ133hkVJ3+wsAVgjDkOHK/5ukxEdgGtgc+tqilSZGRk0LNnL7Z/cQQP/et0DEfJEWw2GyNGjGjg6lRTEhcXx+23385NN93EvHnz+Ocbb7C5MJYpbSu4PqsaRxPsRA4YWHHMxfyDiVR6hck33sj3v//9sF+3JxTC4p9PRNoB1wEbLvKzmSKyWUQ2FxQUNHZpTVavXj2RyiII1G3tF3vFabKz2zb5cc6qYSQmJvKjH/2Iv730Etfl5PL6Fwn8aksaJyvD4hRyxYrdwlPbUvjbnkQ6dOvFc3PmMGvWrKg8+UMYBICIJAJvAv9ujJE+DAIAABexSURBVPnaqk/GmDnGmBxjTE6LFi0av8AmqmXLlmACiLeyTq+3+ypp1apxl5hQ4a9NmzY88eSTPProo5wJJPGfW9LYdKppdBF+XuTgkc3pHKiI48EHH+RPf3omYsbz15WlASAiToIn/3nGmLesrCXSnF2ewdTtpp0EfLpLl7qkYcOGMWfuXLI7dOHZnUmsPNYwy4FkJ/qIsweIswfoluolO7Fh1iv6pNDJ09tSSMlow19mP8eECRMibkhnXVgWABL82/8rsMsY83ur6ohUvtq9VC+xWmggPp1AfPqlDyC2r46h1EW0bNmSZ5/9MwMG5PDinkR2nK7fEuUAM7pU0jbJT9skP7/oV8qMLnVrwZ7ri1IH//tZCp07d2b2c3Oa5KJtoWJlC2AIcCcwWkS21fxp+O2polRZWRkAxn7x5rk7Oxd3du4lX++3xVBaGt7rsCvrxcTE8Ktf/ZoOHTrwP58lU+IJr6tqtx+e2ZlCevMWPPHkU1EzuudKWRYAxpg1xhgxxvQ2xvSt+fOOVfVEmsLCQsTugEsEwOUEnHGc0pvu6grEx8fzX///r/AYG28fDK8T7LuH4yiqhl/88hHS07+hxRulLL8JrELj5MmTmJjEOu8lbGISKSgooCntGKesk5WVxY033sSKY7Gcrg6P00qVT3gnP55hw4bRq1cvq8sJS+HxL6Ua3LFjx/A5E+r8+oArEa/Hw5kzZxqwKhXJbr31VgIGVh8Pj/0hPj4ZQ7UPbrvtNqtLCVsaABHq2PETBOqxN0Dta0+cONFQJakI17JlS3L692PViXgCYdBwXHk8jvbt2nLttddaXUrY0gCIQJWVlVSUl2FciXU+Ru1rNQDU
1bhhwkQKq2BPsbWLpx2rsHGg1M4NEybqcM9voAEQgWpnTAdi6tEFVNN9pLOv1dUYOnQo8XGxlncDrT3hwibC2LFjLa0j3GkARKCioiIAjDOu7gexOxGbQ+8BqKsSGxvLmLHj2FQQS6XPmitvfwBWn4gnNzeXZs0aeHe8CPONASAinUVkgYjsFJFXRaR1YxWm6q6kpAQA46jHVZgIOGN1LoC6apMmTcLtD96EtcK2006K3TBx0iRL3r8puVwL4AVgMTAN2Ao8G/KKVL1VVFQAYOz1a4Ybu/PssZS6Ul27dqVD+3asOG7NnIAVx2JplpZKbu6lJzqqoMsFQJIx5nljzB5jzG+Bdo1Qk6ont9sd/MJWv43bA+Kgurq6ASpS0UREmHzjTXxZauNQWf0+g1eryC3sOBPDhEmTI3YXr4Z0uQCIFZHrRKSfiPQD4i74XoWh2jV8zCXWAbpSRmz4/ZG3A5QKvTFjxuCw21l3onFvBq8/6cIYuP766xv1fZuqy0XkceDchdpOnPO9AXSjWKXU16SkpDAoN5f1W9Zya6dKbI10P3j9yTi6du1CVlZW47xhE/eNAWCMGdVYhaiGY7cHm91iAtRnPo4QOHsspa7WqFGjWLt2LftLHHRJDf3KsqeqbBwqs/Hj0WNC/l6R4oo6yUTk2xd5uAT41BhzqmFLUvV1di+AQP26b2zGr/sBqzrLy8vD6bCzqSCmUQJgc0Hwszp8+PCQv1ekuNJO4h8Ac4E7av48D8wC1orInSGqTdVR7ZK34vfW6zji95KQUPfJZCq6JSQk0K9/Dp+cjqUx1hTcUuiiU6eOwd3w1BW50gAIANcaY6YZY6YB3QE3MAh4MFTFqbpJTk4GQHz1HMHjrdY9gVW9DBs2jFOVQn5FaLsSSzzC/mIHQ4cOC+n7RJorDYB2xpiT53x/CuhijDkD1O8yUzW42g2uxVtV94P4vRi/V9dQV/UyePBgRIQtBaHtStxaEINBu3+u1pUGwGoRWSwid4nIXcBCYJWIJADFoStP1UWLFi0AEE/dJ3GJp/K8YylVF+np6fTs0Z3NhaHdX3pzgYtWLTNp3759SN8n0lxpANwLvAj0Ba4DXgLuNcZU6Eih8JOYmEhcfDw2T1mdj2FzB5eA0P5UVV/DR4wkv8zGycrQLD1W4RU+L3IyYuQoXfnzKl3Rv4gJbgu1BlgGfAisMg2wVZSI3CAie0Rkv4j83/oeTwWJCFlZWdiqS+p8DFt1MAB0PLWqr6FDhwJfjdJpaNtOO/Gbr95HXbkrCgARuQXYCEwHbgE2iMj0+ryxiNiB/wEmELypfJuIdK/PMdVXOnbogLO67r1ztqoiEpOSSU1NbcCqVDRq2bIlnTt1Clk30OaCGJo3S9eNX+rgSttkDwMDjDF3GWO+CwwEHqnnew8E9htjDhhjPMBrwJR6HlPV6NSpE8ZTdbYv/2o5q07TtUvnBq5KRasRI0fyRYmdwgbeL7jKB5+eiWX4iJHYbLq6/dW60r8x2wUTvk5fxWsvpTWQf873R2oeO4+IzBSRzSKyWTcnuXJdu3YFwF5Rh78zvw+pPKNXVKrBjBoVvFW4oYGXiN5aEIPHbxg9WlelqYsrPYkvFZH3ROR7IvI9YAnwTj3f+2J3a752X8EYM8cYk2OMydERKVeuc+fO2B0O7GUnL//kC9grCsAYunfXHjnVMFq3bk33a7ux5mRcg04KW30ilsxrMujRo0fDHTSKXOlN4P8A5gC9gT7AHGNMfSeAHQHOvcPYBjhWz2OqGi6Xi2uvvRZH+dXv6WsvO4GI0KtXrxBUpqLVjTdN4Wi5jd0NtF/w0Qo7nxc5mXzjTTr6p46uuBvHGPOmMWaWMeZ+Y8zbDfDem4DOItJeRGKA7xCcX6AaSP9+/bBVngaf+6pe5yg7TseOnXQWsGpQo0ePJikxgaX59diq9Bzv58fidNiZpDt/1dnltoQsE5HSi/wpE5F67RVojPEBPwHeA3YBbxhjPqvPMdX5cnJy
wBgcpVfRsPJ7sJefYsCAnNAVpqKSy+Vi2vSb+aQwhsPl9Vsa4ky1jdUnYpkwcdLZme/q6n1jABhjkowxyRf5k2SMSa7vmxtj3jHGdDHGdDTG/Ka+x1Pnu/baa4lPSMBRcuSKX+MoOQYmwKBBg0JYmYpW06ZNIz4ulgUH69cKWHw4FiM2br/99gaqLDrpuKkI5nA4yB00iJjSI2ACV/aaknziExLo2bNniKtT0SgpKYnpN9/CpgJXnVsBZ9w2VhyL44YbJpCZmdnAFUYXDYAIN2TIEIynCnv5FQwHNQFiSo4wOC9P91NVIXPzzTcTHxfLwi/r1gp4t+bqf8aMGQ1cWfTRAIhwubm52B0OHEVfXva59rKTGG+VTqlXIZWUlMSNN01hc4HrqieGVfmEVcfjGDVqtK5T1QA0ACJcQkICOTk5xBQf4nIDsB1nDuKMidH+fxVyU6dOBYQVx65u0/i1J2Ko8sH06fVaiUbV0ACIAqNGjgR3ObZvmhVsArhKDjE4L4+4uIYZpqfUpWRmZtL3ur5sOHV1E8M2nIqlXdtsunXrFrrioogGQBQYOnQodocD55kDl3yOvewExlN1dsq+UqE2evQYTlYKR65wt7ASj7Cn2MEo3fS9wWgARIHExERyB+XiKvrykqOBHKcPEBsbR15eXuMWp6LWwIEDAfjsjPOKnr+rKPg87aJsOBoAUWLs2DEYT+XF1wYK+HEVH2L48GG4XFfXJ6tUXWVkZNC6VUt2FZ8fANmJPrITfV97/q4iJwnxcXTq1KmxSox4GgBRIi8vD5crFsdFuoHspUcxPreuqKgaXZ++17Gv1HXefYAZXSqZ0eXry5jvK42hZ69eOkS5AWkARInY2FiGDBmMq/gQBM7vBnKePkhCQmJw6QilGlH37t0p9xhOXGa7yEqfcLTcRo8eOkGxIWkARJFRo0ZhvNXYy45/9WDAT0xpPiNGDNcrK9Xoamec7y/95vsA+0scGNBlnxuYBkAUGTBgADExLhzFh88+Zi87jvF5GDZsmIWVqWiVnZ1NYkI8ey+zRPS+Egc2Ed2kqIFpAESR2NhYBgzIIaYk/+ykMEdxPjExLvr162dxdSoa2Ww2evTsxb6ybx58sLfESceOHYiPj2+kyqKDBkCUGTRoUHBSWHUJADFlx+jXr5+O/lGW6d27N8fKhTLPxTd18QXgQKmTXr37NHJlkU8DIMr0798fCHb9iLscqkrIyelvcVUqmtXuPLe35OLdQIfKHLj9waBQDUsDIMq0atWKtPR07GUnsZcH5wT06aNXVso63bp1w+l0sLf44jeC99QEg25R2vA0AKKMiNCzRw+cVaexVxTijImhffv2VpelolhMTAxdu3Rlb2nMRX++r9hJq5aZNGvWrJEri3yWBICI/FZEdovIDhF5W0RSragjWnXu3BmqSrCXn6RDhw46/FNZrlfv3nxZZsfjP/9xY2B/WQw9eurVfyhY1QL4AOhpjOkN7AUesqiOqNSuXTsA7BWFdOzQwdpilCLYDeQPQH7F+RcjZ9w2Stzo8M8QsSQAjDHv12wKD/Ax0MaKOqJVmzZf/XW3bt3awkqUCqpd3vlg6fkrg35ZFvy+a9eujV5TNAiHewB3A+9e6ociMlNENovI5oKCK9jWUF3WNddcc/brjIwMCytRKigjI4OE+DiOXNACyC8Pft9BW6ohEbIAEJEPRWTnRf5MOec5DwM+YN6ljmOMmWOMyTHG5LRo0SJU5UaVhIQEunbrRnJKil5ZqbAgIrRr356jFwTA0Qo7mddk6CZFIRKyu3/GmLHf9HMRuQuYDIwx5mr2BFIN4bnZs60uQanzZGVls+GLz8977GSVg6xubS2qKPJZNQroBuBB4CZjzNfXfVVKRZ1WrVpRVM15I4FOVTv0PlUIWXUP4M9AEvCBiGwTEb0cVSrKZWZmAnDaHTwtVfqESq85+7hqeJYMADfG6JY+Sqnz1A5OOF1tp2V8gDPVwSDQe3+hEw6jgJRS6uyI
tNoT/5malsC5o9ZUw9IAUEqFhdqlHmpP/LX/bd68uWU1RToNAKVUWIiJiSE1OYmi2gCotiEiGgAhpAGglAobzVu0OHvlX+yxkZKcpGtVhZAGgFIqbDRr3oISb/CEX+S26dV/iGkAKKXCRnp6OiWe4Gmp1GsnLV2XgA4lDQClVNhITU2lzBNcBrrMZyctLc3qkiKaBoBSKmykpKTgC4DbD2UeITk52eqSIpoGgFIqbMTHxwNQ4bPh9hkSEhIsriiyaQAopcJGbQCUeOS871VoaAAopcJGTExwX+BKn+2871VoaAAopcJG7Zj/an+wBeB0Oq0sJ+JpACilwobdHtwC0hc4/3sVGhoASqmwYbMFT0n+mgAQEQuriXwaAEqpsHE2AIyc970KDf3bVUqFjdoTvs+c/70KDf3bVUqFjdqbvnoTuHFYGgAi8jMRMSKiKz4ppc6e8Cu9GgCNwbIAEJEsYBxw2KoalFLhpXbiV1HNgnA6Ezi0rGwB/AH4OWAsrEEpFUZqA+B0zbaQOhM4tCwJABG5CThqjNluxfsrpcJTSkoKACcq7ed9r0IjZFvtiMiHQOZFfvQw8Atg/BUeZyYwEyA7O7vB6lNKhR+Hw0FSQjxnKioBDYBQC1kAGGPGXuxxEekFtAe210zyaANsFZGBxpgTFznOHGAOQE5OjnYXKRXh0tLSKKuoJCE+DpfLZXU5Ea3Ru4CMMZ8aYzKMMe2MMe2AI0C/i538lVLRJ61ZcFCgbgYTejoPQCkVVtLT02v+q9tBhlrIuoCuVE0rQCmlgK/6/VO1BRBy2gJQSoWVpKQkABITEy2uJPJpACilwkrtiV/XAQo9/RtWSoWV2NhYQJeCbgwaAEqpsKIn/sajAaCUCivG6HSfxqIBoJRSUUoDQCmlopQGgFIqrNTeA9CuoNDTAFBKhSW9GRx6GgBKqbCkLYDQ0wBQSoUlbQGEngaAUiosaQsg9DQAlFJhSVsAoacBoJQKS9oCCD0NAKVUWKk98WsLIPQ0AJRSKkppACilVJTSAFBKqShlWQCIyH0iskdEPhORp62qQymlopUlewKLyChgCtDbGOMWkQwr6lBKhR9dC6jxWNUC+DHwpDHGDWCMOWVRHUqpMKWjgELPqgDoAgwTkQ0islJEBlzqiSIyU0Q2i8jmgoKCRixRKWUlbQGEXsi6gETkQyDzIj96uOZ904BcYADwhoh0MBf5FzfGzAHmAOTk5OgnQqkooS2A0AtZABhjxl7qZyLyY+CtmhP+RhEJAM0BvcRXSgHaAmgMVnUB/QsYDSAiXYAYoNCiWpRSYUhbAKFnySgg4AXgBRHZCXiAuy7W/aOUij56Kmg8lgSAMcYDzLDivZVSSgXpTGCllIpSGgBKqbCiff+NRwNAKRWW9F5A6GkAKKXCkrYEQk8DQCkVlrQFEHoaAEqpsKQtgNDTAFBKhZUuXboAMHDgQIsriXxWTQRTSqmL6tatGwsWLCAlJcXqUiKetgCUUmFHT/6NQwNAKaWilAaAUkpFKQ0ApZSKUhoASikVpTQAlFIqSmkAKKVUlNIAUEqpKCVNab0NESkADlldRwRpjm7FqcKTfjYbVltjTIsLH2xSAaAalohsNsbkWF2HUhfSz2bj0C4gpZSKUhoASikVpTQAotscqwtQ6hL0s9kI9B6AUkpFKW0BKKVUlNIAUEqpKKUBEAFEZKqIGBHpdpnnfU9EWp3z/VwR6R76ClU0EhG/iGwTkc9EZLuIzBIRPeeEEf3HiAy3AWuA71zmed8DzgaAMeaHxpjPQ1iXim5Vxpi+xpgewDhgIvBfFtekzqEB0MSJSCIwBPgB5wSAiPxcRD6tufJ6UkSmAznAvJqrsjgRWSEiOSLyYxF5+pzXfk9Enq35eoaIbKx5zXMiYm/kX1FFAGPMKWAm8BMJsovIb0Vkk4jsEJEf1T73ws9uzWN9ReTjmue+LSJpItJRRLae87rOIrKl8X+7pksDoOn7FrDUGLMXOCMi/URkQs3j
g4wxfYCnjTHzgc3AHTVXZVXnHGM+8O1zvr8VeF1Erq35eogxpi/gB+5ohN9JRSBjzAGC55wMghcsJcaYAcAA4N9EpP3FPrs1L/878KAxpjfwKfBfxpgvgBIR6VvznO8Df2u0XygC6KbwTd9twB9rvn6t5nsb8KIxphLAGHPmmw5gjCkQkQMikgvsA7oCa4F7gf7AJhEBiANOheKXUFFDav47Huhd0zIFSAE6A2O54LMrIilAqjFmZc1zXwL+WfP1XOD7IjKL4MXKwEb4HSKGBkATJiLNgNFATxExgB0wwJs1/70arwO3ALuBt40xRoJn/ZeMMQ81YNkqSolIB4KtyFMEg+A+Y8x7FzznBq7us/smwfsKy4AtxpjTDVRuVNAuoKZtOvB3Y0xbY0w7Y0wWcBA4A9wtIvEAIpJe8/wyIOkSx3qLYNP7NoJhAPARMF1EMmqPIyJtQ/OrqEgmIi2A2cCfTXD26XvAj0XEWfPzLiKSALzPBZ9dY0wJUCQiw2oOdyewEsAYU11zrL8ALzbm7xQJtAXQtN0GPHnBY28C1wILgc0i4gHeAX5BsH90tohUAXnnvsgYUyQinwPdjTEbax77XER+CbxfM3zPS7BbSJfkVlciTkS2AU7AB/wD+H3Nz+YC7YCtNS3NAuBbxpilNX36F3527yL42Y0HDhDs7681j+A9rPdD/ytFFl0KQinVpInIz4AUY8wjVtfS1GgLQCnVZInI20BHgvfC1FXSFoBSSkUpvQmslFJRSgNAKaWilAaAUkpFKQ0ApS5BRMov8/N2IrLzKo/5t3NmvyplKQ0ApZSKUhoASl2GiCSKyEcisrVmlcop5/zYISIv1axSOf+cGaz9RWSliGwRkfdEpKVF5St1SRoASl1eNTDVGNMPGAX8d83sVQgunDenZpXKUuCemuUNngWmG2P6Ay8Av7GgbqW+kU4EU+ryBHhcRIYDAaA1cE3Nz/KNMWtrvn4Z+CmwFOgJfFCTE3bgeKNWrNQV0ABQ6vLuAFoA/Y0xXhH5Eoit+dmFMykNwcD4zBiTh1JhTLuAlLq8FOBUzcl/FHDuiqjZIlJ7oq/dmnMP0KL2cRFxikiPRq1YqSugAaDU5c0DckRkM8HWwO5zfrYLuEtEdgDpwF+MMR6CS3U/JSLbgW3A4EauWanL0rWAlFIqSmkLQCmlopQGgFJKRSkNAKWUilIaAEopFaU0AJRSKkppACilVJTSAFBKqSj1/wB0T/+uTzVtlwAAAABJRU5ErkJggg==\n",
      "text/plain": [
       "<Figure size 432x288 with 1 Axes>"
      ]
     },
     "metadata": {
      "needs_background": "light"
     },
     "output_type": "display_data"
    }
   ],
   "source": [
    "sns.violinplot(x=tmp_df[\"label\"], y=tmp_df[\"logP\"])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Finally, we will do the same comparison with the formal charges of the molecules.  "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 24,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "<matplotlib.axes._subplots.AxesSubplot at 0x7f6a94f97a58>"
      ]
     },
     "execution_count": 24,
     "metadata": {},
     "output_type": "execute_result"
    },
    {
     "data": {
      "image/png": "iVBORw0KGgoAAAANSUhEUgAAAYoAAAEGCAYAAAB7DNKzAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADh0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uMy4xLjEsIGh0dHA6Ly9tYXRwbG90bGliLm9yZy8QZhcZAAAgAElEQVR4nO3deXhc9X3v8fd3RrvkRbu8y5IdCJsJGHDA4cFQkkBZkjTpDbht4AbIRkhu0uaSpU3a3gaSXJIU6MV1KInp5SnchISlgIGAWRMCMiWsBtuyjOVFkhdkS9Y2M9/7x4yELEvHMtboaEaf1/PMozPn/GbmO/Y885lzfuf3O+buiIiIjCQSdgEiIjKxKShERCSQgkJERAIpKEREJJCCQkREAuWEXUA6VFRUeG1tbdhliIhkjLVr1+5098rhtmVlUNTW1tLQ0BB2GSIiGcPMNo+0TYeeREQkkIJCREQCKShERCSQgkJERAIpKEREJJCCQkREAikoREQkkIJCRDKWLpMwPhQUIpKR7rjjDq644oqwy5gUsnJktohkv5/97GdhlzBpaI9CREQCKShERCSQgkJERAIpKEREJFBoQWFmc8xsjZm9YWavmdlXhmljZnajmW0ws5fN7KQwahWRiSuRSIRdQtYL86ynGPB1d3/RzKYAa83sUXd/fVCb84CFqdtpwC2pvyIiQDIoIhEdHEmn0P513X27u7+YWt4HvAHMGtLsYuB2T3oOmG5mM8a5VBGZwOLxeNglZL0JEcNmVgt8APjDkE2zgC2D7jdzcJj0P8dVZtZgZg1tbW3pKFNEZFIKPSjMrAS4G/iqu+8dunmYhww7Zt/dV7r7YndfXFk57GVfRUTkPQg1KMwsl2RI3OHuvx6mSTMwZ9D92cC28ahNRDKD+ifSL8yzngz4N+ANd//xCM3uA/4qdfbTEqDd3bePW5EiMuElv0okncI86+kM4C+BV8zspdS6bwFzAdx9BfAgcD6wAdgPXB5CnSIygUWj0bBLyHqhBYW7P8PwfRCD2zjwpfGpSEQykfYo0k8H90REJJCCQkREAikoREQkkIJCREQCKShERCSQgkJERAIpKEREJJCCQkREAikoREQkkIJCREQCKShERCSQgkJERAIpKEREJJCCQkREAikoREQkkIJCREQCKShERCSQgkJERAIpKEREJJCCQkREAikoREQkUKhBYWa3mVmrmb06wvazzKzdzF5K3f5uvGsUEZnsckJ+/V8ANwO3B7R52t0vGJ9yRERkqFD3KNz9KWB3mDWIiEiwTOij+KCZ/dHMHjKzY0dqZGZXmVmDmTW0tbWNZ30iIlltogfFi8A8d18E3ATcM1JDd1/p7ovdfXFlZeW4FSgiku0mdFC4+15370gtPwjkmllFyGWJiEwqEzoozKzGzCy1fCrJeneFW5WIyOQS6llPZvYfwFlAhZk1A98FcgHcfQXwSeALZhYDuoBPu7uHVK6IyKQUalC4+yWH2H4zydNnRUQkJBP60JOIiIRPQSEiIoEUFCIiEkhBISIigRQUIiISSEEhIiKBFBQiIhJIQSEiIoEUFCIiEkhBISIigRQUIpLRNP1b+ikoRCSj9fb2hl1C1lNQiEhG0xUt009BISIZJ5FIDCw3NzeHWMnkoKAQkYzz5ptvDiyvXbs2xEomBwWFiGScxx57DDOomxLjySfWqJ8izRQUIpJRtm/fzr33/Ial1d18om4/rW07uffee8MuK6spKEQkY8TjcW644X9jHufP6rs4obyP48v6+MXPb1NfRRopKEQkY6xcuZKGhrUsX9BBWX6yQ/uyozqwWBff/ua1dHZ2hlxhdlJQiEhGeOCBB7jrrrs4Z1Y3y2b1DKyvLExw9bHtbGlu5u+/9z1isViIVWYnBYWITHjPPfccN9xwA8eX9bF84cF7DceUxrjsqA6ef+EFbrjhBo3WHmOhBoWZ3WZmrWb26gjb
zcxuNLMNZvaymZ003jWKSLhaW1v5++99lznFMa4+bi85I3xrnTWzh4/V7uehhx7i/vvvH98is1zYexS/AD4asP08YGHqdhVwyzjUJCITyIoVtxDr7eGa49opzAlu+/H5Xby/NMbPVv4r7e3t41PgJBBqULj7U8DugCYXA7d70nPAdDObMT7ViUjYmpubefzxNZw3dz+VhYlDtjeDv1zYQUdHp06ZHUNh71Ecyixgy6D7zal1BzGzq8yswcwaNPeLSHbo6OgAYMHU0XdQzy6Jk59j7Nu3L11lTToTPShsmHXD9lK5+0p3X+zuiysrK9NcloiMh0gk+RXV3jv6r6rOPqMv4QOPlSM30f8lm4E5g+7PBraFVIuIjLP6+nrq6+Zzz+YSeuKje8yvNxXibpx77rnpLW4SmehBcR/wV6mzn5YA7e6+PeyiRGR8RKNRrvnKV9nVBXesL+ZQZ72+ujuXx7YWcsGFF7JgwYLxKXISCPv02P8Afg8cZWbNZvZZM/u8mX0+1eRBoBHYAPwM+GJIpYpISBYtWsSll17KE9sK+M/NBSO227wvyk2vTqO2tpbPfe5z41hh9jvEyWbp5e6XHGK7A18ap3JEZIK64ooraG1t5Ze//S2VhQmWVB84W+w7PcaPX5lOyfQyfvDDH1FcXBxSpdlpoh96EhEhEonwjW98g+OPO5Zb102laV90YFtvHH76yjS6PI/rrv8BOpll7CkoRCQj5OXl8Q//+L+YVlrGTa9NozfVuX13YxGNe6N869vfUb9EmigoRCRjlJaWcu03v0XbfuPR5gJa9kd4ZGsh5513HmeeeWbY5WWtUPsoREQO18knn8ySJadx/9o/sKUjh5zcPK644oqwy8pq2qMQkYxz0UUXs78PfteSz9KlH6K8vDzskrKagkJEMs5JJ707kfSSJUtCrGRyUFCISMYpKHh3PEVdXV2IlUwOCgoRyWhVVVVhl5D1FBQiktE0uC79DisozEz/IyIyoWiW2PQb1b+wmZ1uZq8Db6TuLzKz/5PWykREZEIYbRT/BPgIsAvA3f8IaHSLiMgkMOp9NnffMmTVKGeHFxGRTDbakdlbzOx0wM0sD7iG1GEoERHJbqPdo/g8yem+Z5G86tyJaPpvEZFJYVR7FO6+E1ie5lpERGQCGlVQmNmNw6xuBxrc/d6xLUlERCaS0R56KiB5uGl96nYCUAZ81sx+mqbaRERkAhhtZ/YC4Gx3jwGY2S3AI8C5wCtpqk1ERCaA0e5RzAIGj8ouBma6exzoGfOqRERkwhjtHsUPgZfM7AnASA62+35qSo/fpqk2ERGZAA65R2FmRvIw0+nAPanbUne/1d073f1v3uuLm9lHzexNM9tgZtcOs/0sM2s3s5dSt797r68lIiLvzSH3KNzdzewedz8ZGLMznMwsCvwLyX6OZuAFM7vP3V8f0vRpd79grF5XREQOz2j7KJ4zs1PG+LVPBTa4e6O79wJ3AheP8WuIiMgRGm1QLAN+b2YbzexlM3vFzF4+wteeBQyeP6o5tW6oD5rZH83sITM7dqQnM7OrzKzBzBra2tqOsDQREek32s7s89Lw2jbMOh9y/0Vgnrt3mNn5JPtHFg73ZO6+ElgJsHjx4qHPIyIi79Go9ijcfbO7bwa6SH6Z99+ORDMwZ9D92cC2Ia+71907UssPArlmVnGErysiIodhtBcuusjM1gObgCeBJuChI3ztF4CFZjY/NSPtp4H7hrxuTeqsK8zs1FS9u47wdUVE5DCM9tDTPwJLgN+6+wfMbBlwyZG8sLvHzOxq4GEgCtzm7q+Z2edT21cAnwS+YGYxknszn3Z3HVYSERlHow2KPnffZWYRM4u4+xoz+8GRvnjqcNKDQ9atGLR8M3Dzkb6OiIi8d6MNinfMrAR4CrjDzFqBWPrKEhGRiWK0p8deTPLQz/8AVgMbgQvTVZSIiEwco71wUeegu6vSVIuIiExAoz3r6RNmtj4179JeM9tnZnvTXZyIiITv
cGaPvdDd30hnMSIiMvGMto+iRSEhIjI5Be5RmNknUosNZnYXySk0Bi5U5O6/TmNtIiIyARzq0FP/mU0O7Ac+PGibAwoKEQlVIpEgEhntwRF5LwKDwt0vBzCzVcBX3P2d1P1S4Ib0lyciEmzPnj2Ul5eHXUZWG20Mn9AfEgDuvgf4QHpKEhEJFou9O963sbExxEomh9EGRSS1FwGAmZUx+jOmRETG1LPPPjuwfP/994dYyeQw2qC4Afidmf2jmf0D8DuSp8yKiIyrdevWcfNNN1JZ6Jw3t4unn36KO++8E80Xmj6jHZl9u5k1AGeTvODQJ4a5trWISNokEglWr17NT37yY6blxPjKce3UFMXZ2RVhxYoVvPXWW3z961+nuLg47FKzjmVjCi9evNgbGhrCLkNEjlAikeCNN95gzZo1PLHmcXbu2s0xpTG+dOxepuQlv7vc4T83F/CrxmJyc3M5bckSli1bxpIlSygqKgr5HWQOM1vr7ouH3aagEJGJpLu7m/Xr1/PUU0/xxJrHadu5i5wIHF/Wy2lVPZxW1Ut0mIPmjXujPLMjn4a2Qt7pgfy8XE5b8kGWLVvGokWLKCsrG/83k0EUFCIy4cTjcbZv305jY+PAbeOG9WzbvgN3HwiHU6t6+EBFH0U5o/uuSji89U4Of2jNp2FnAe2pIcLTp06hbsFC6urqBm61tbUUFBSk8V1mDgWFiIQmFouxc+dOtm7dSmNjI5s2bWLjxg00NTXR09MLJDs+q4ud2UW9zC6OM6ckzrFlow+HkSQc1rfnsGlfDs0dUbZ05rK1M0pvPLndzJg5o4b6BQuZP38+9fX1zJ07l+rqagoLC4/wnWeWoKDQKa4ickS6u7tpaWk54LZjxw5aW1rYsWMbO3ftIZFIDLSfmm/MLurlrKpYKhRizCqOkx8d+9oiBkdNj3HU9HfHXSQcWrsibOnIYUtHlObOJt5au42nn3qKwbE0dUoJ1dU11MyYQXV19UG3adOmYWZjX/QEpKAQkRG5O+3t7bS1tbFjx44Dw2D7dlpbdvDO3n0HPCZiUFYA5fl9LMhPsGROnPKCBFWFcWaXxJmWF+5RjIhBTVGCmqJeTql6d31PHLZ2RtmxP8qu7gg7u7vZtXs3G3es5/kuoyd+4PPk5+dRXVVFzYyZB4VIVVUVFRUV5ORkx1dsdrwLETls7k5HRwetra0Dt7a2Ntra2mhtbaF1xw7adu6it6/vgMflRaG8wCnP7+PEkgQVFQnKC+JUFCQoL0hQmpcYtrN5osuPQt3UOHVT4wdtc4eOmKUCJMrO7gi7urvY2b2P1nVv8/pLUTp6DwzAiBmlpdOpqqqmqrqayspKqqqqqKqqGlguKysjGk3DrtQYU1CIZKn+zuKtW7cOCoBUKLRsp61tJ92pPoJ+EYPSAijLizErP84JNQnKChKU5b8bBlNynUlyxGWAGUzJdabkxqmdcnCQQHKPZGd3cm9kV3eE3T0Rdnd3sbulhbe25PJc98F7JZFIhPKyUqqqq5OBMihEqqurmTt37oToKwk1KMzso8A/A1HgVne/fsh2S20/n+TstZe5+4vjXqjIBJZIJGhpaWHTpk1s2rSJpqYmNjVu5O23txywN2DA9AIoy49RkxfnmKr+AEj+LStIMD0vQWSShcBYyY/CrOI4s4qHDxJ36IxZKkCSQZIMlP3s3rqD15tyeaYb+gY93Myorqqkrn4BtbW1zJ8/n9raWubOnUt+fv44vbMQg8LMosC/AOcCzcALZnbfkBHf5wELU7fTgFtSf0Umpb6+Pv7rv/5rIBAaGzeyuanpgD2DskKYXdjLOTXJPoHqwvhACORk4CGhkaxvz2HdnhyOLo2xcFrs0A8ImRmU5DoluXHmlowcJh19xq6eCG1dUbZ2Rtna2c3ml3fwh9//jnjq6FbEjJkzZzC/rn4gPBYtWpS2WXTD3KM4Fdjg7o0AZnYncDEwOCguBm735Dm8z5nZdDOb4e7b01XUTTfdxIYN
G9L19JLBFixYwJe//OVQa7jrrru49dZbAZiWD7OK+jizMnnW0OyS5K/ZIz2lNBOsb8/hxnXVfOT8P+XGBx/gmqNbMiIsDsUMpuQ5U/KSh7hOGbQtloAd+5Ph0dwZZWvnJta/uJVnnnkad5hfO4+f/2JVWuoKMyhmAVsG3W/m4L2F4drMAg4KCjO7CrgKYO7cuWNaqMhE0T/u6UdL9lBdlDhE6+y1bk8OHzn/T/nil76MO6x78fasCIogORGYXZL8QTD4i7I7Bj9+eSo9aRwTF2ZQDHckdOg7HU2b5Er3lcBKSA64e69Fhf2LUSRI/zQUf/NcKRWFMKuod+C4+KziGDOL4hRMglNUji6NceODD+AOjzz0ANccnd0hAcnxHy1dEbZ25rCtf69ify47OiP0JeCkuvRdvCnMj1QzMGfQ/dnAtvfQRmTS+PCHP8y0adNobGykqamJpsaNPNrcTF/s3WPelUUws/DdAJlZFKe8IM7UPM+ajuqF02Jcc3QL6168nWuOzow+itHaHzN2d0cGQmFrKhC2pwKhX3VVBbXH1HN6bS21tbUsXjzsoOoxEdoUHmaWA7wFnANsBV4ALnX31wa1+VPgapJnPZ0G3Ojupx7quTWFh0wmsViMbdu2JYOj/9a4kS1DAiQaSQ6EK83to6wgQXl+/6mv8YGznybjqa/jqTsGu3qiA2c99Z8Btasnwp7eXHZ3R+iKHfidXFVZwfy6eubNmzfQcT1v3rwxnxl3Qk7h4e4xM7saeJjk6bG3uftrZvb51PYVwIMkQ2IDydNjLw+rXpGJKicnh7lz5zJ37lzOPPPMgfWxWIytW7fS3Nw8MJiufxzF5tYdrN2264AggeRx8LICKMvrGzhltiw/PnAKbUVBgqIchclw+hIMGkORHE+xpycZArt7ctjdE2V/38E/zMumT6Oyqoq66hqWpMZRVFZWMmPGDGprayfEVOmaFFBkkkokErzzzjsDAXLA35YWWlt2sHP3buLxAzvNC3OM8oI45fkxKgoSVKQG4iVHZmfXIa7BumLGzu5IalT2oBHaPVF29eTwTvfBj5k+dQqVVVVUVdccMDK7f2BdRUUFubm54/9mhjEh9yhEJFyRSISysjLKyso46qijhm2TSCTYs2cPra2tB0/8t30bG1ta6Ojcf8BjciNQXgjleb0D03r0h8ms4vjABYcmmt44bN8fpbUremAY9OSwqztK55C9gdycKFWVlVTXz+SomhpqamoG5nnqn+tpPAfFpZOCQkRGFIlEKC8vp7y8nPe///3Dtuns7ByYMXbopIEv79jGnu17D2g/vQDmFPUypyTOnOIYc0rizCiOkztOgwHdk4eItnRGB2aQ3bI/jx2dRmJQFhQVFlBdXc3MBTM4KRUC/beamhpKS0uJRLJoBGMABYWIHJHi4uKBCwENp6enh9bWVrZv386mTZuSFyjauIFHm5oG+kiiBjXFCeYU9zGnOMbckjjvmx6jcAyuR7Fpbw6bO6K83ZHDlo4cmvfn0jVo76Cmuor6RQs5J/UeZs+eTU1NDSUlJZNmGvFDUR+FiISiv7N948aNg65w9xYtrTuB5CGsE8p6Oa26hxPLe0c9PiThsHFvDn9oyeP5nYUDfQfFRYXU1y+grr6euro66uuT019MhM7iiUBXuBORjNHR0cH69et55plneGLN4+zavYe8KJxYnrxe9kkVw18z++19UZ7dkc/zOwvZ1QW5uTksWbKEs85axvHHH09lZaX2EAIoKEQkIyUSCV555RXWrFnDE2se5532vRxb1seXjt1HSW7yu8sdHni7gF9uLCYajXLKKaew7OyzOeOMMyguLg75HWQOBYWIZLxYLMbq1av555/+hNK8GF89rp2qwjj/tq6E37fkc9ZZZ/G1r32NqVOnhl1qRtLpsSKS8XJycrjggguYP38+f/udb/OTV+HUim6ea83nyiuv5NJLL9WhpTSZHOd2iUjWOPbYY7nmK1+lbb/xwNuFLF36IZYv
X66QSCMFhYhknKVLlw4sX3DBBSFWMjkoKEQk4+TkvHvUfMGCBSFWMjkoKEQko02fPj3sErKegkJEMlo0Gg27hKynoBARkUAKChERCaSgEBGRQAoKEREJpKAQEZFACgoREQmkoBARkUAKChERCRTK7LFmVgbcBdQCTcCfu/ueYdo1AfuAOBAbaQpcERFJn7D2KK4FHnP3hcBjqfsjWebuJyokRETCEVZQXAysSi2vAj4WUh0iInIIYQVFtbtvB0j9rRqhnQOPmNlaM7sq6AnN7CozazCzhra2tjEuV0Rk8kpbH4WZ/RaoGWbTtw/jac5w921mVgU8ambr3P2p4Rq6+0pgJSQvhXrYBYuIyLDSFhTu/icjbTOzFjOb4e7bzWwG0DrCc2xL/W01s98ApwLDBoWIiKRHWIee7gM+k1r+DHDv0AZmVmxmU/qXgQ8Dr45bhSIiAoQXFNcD55rZeuDc1H3MbKaZPZhqUw08Y2Z/BJ4HHnD31aFUKyIyiYUyjsLddwHnDLN+G3B+arkRWDTOpYmIyBAamS0iIoEUFCIiEkhBISIigRQUIiISSEEhIiKBFBQiIhJIQSEiIoEUFCIiEkhBISIigRQUIiISSEEhIhnNXVcVSDcFhYhktK6urrBLyHoKChHJaLqiZfopKEQk48RisYHlt99+O8RKJgcFhYhknNdee21guaGhIcRKJgcFhYhknIcffpicCBxT2seTT6xRP0WaKShEJKNs2bKF1atXc/bMLv6sbj/vtO/l7rvvDrusrKagEJGMEYvF+NGPfkhuJMGFtV0snBbjAxW9/Pu/305TU1PY5WUtBYWIZIxbb72Vl19+hcvft49pecnxE5cd1UE+vfzd336H/fv3h1xhdlJQiEhGePzxx7nzzjs5e1Y3p9f0DqwvzXe+eEw7W5qbue6660gkEiFWmZ0UFCIy4b311lv84PrrWDg9xvKFnQdtP6Y0xqfrO3n66adZtWpVCBVmt1CCwsw+ZWavmVnCzBYHtPuomb1pZhvM7NrxrFFEJoZ4PM513/8niiN9XHPcXnJH+Nb66JxuzqjpZtWqVaxfv358i8xyYe1RvAp8AnhqpAZmFgX+BTgPOAa4xMyOGZ/yRGSiePTRR9nUtJlL6t/tlxiOGfzFwv0U58LKlf86jhVmv1CCwt3fcPc3D9HsVGCDuze6ey9wJ3Bx+qsTkYnk4YdXM7M4wSlVvYdsW5zrfGT2fl54oYHdu3ePQ3WTw0Tuo5gFbBl0vzm1blhmdpWZNZhZg+Z+EckeiYQzJTdOxEbXfmpesjNbs8qOnbQFhZn91sxeHeY22r2C4T4WI/7Pu/tKd1/s7osrKyvfW9EiMuGUlJTQ0pVLb3x07d/uyCFiRmFhYXoLm0Ry0vXE7v4nR/gUzcCcQfdnA9uO8DlFJMN86lOf4tlnn2X1lkIuqg2eqmN7Z4QntxVwwYUXUlRUNE4VZr+JfOjpBWChmc03szzg08B9IdckIuPsxBNP5ENLl3Lv5mIa90ZHbNcbhxVvTCW/oJDLL798HCvMfmGdHvtxM2sGPgg8YGYPp9bPNLMHAdw9BlwNPAy8Afw/d39tpOcUkez19b/+a8orKvnpq9PZ03PwUWl3uHVdCU37onzr29+mtLQ0hCqzV1hnPf3G3We7e767V7v7R1Lrt7n7+YPaPeju73P3enf/pzBqFZHwTZ8+ne9fdz3dns/Nr00lNmTw9cPNBTzXks9nP3sFS5cuDafILDaRDz2JiAyoq6vjb77xDda/k8MvG9/tf9jQnsNdG4o544zTWb58eYgVZi8FhYhkjHPOOYeLLrqIh94uZPO+KAmH296cQkVlJdde+03MRnkOrRwWBYWIZJQrr7ySkuIi/mNDMfdsKqS5I8JVn/s8U6ZMCbu0rKWgEJGMMmXKFC65dDmv78nlnqYi6uvms2zZsrDLymppG0chIpIul1xyCUuXLiWRSFBdXU0kot+86aSgEJGME4lEmDdvXthlTBqKYRER
CaSgEBGRQAoKEREJpKAQEZFACgoREQmkoBARkUAKChERCWTZeLlAM2sDNoddR5aoAHaGXYTICPT5HDvz3H3Yy4NmZVDI2DGzBndfHHYdIsPR53N86NCTiIgEUlCIiEggBYUcysqwCxAJoM/nOFAfhYiIBNIehYiIBFJQiIhIIAXFJGJmHzczN7OjD9HuMjObOej+rWZ2TPorlMnIzOJm9pKZvWZmfzSzr5mZvpsmEP1nTC6XAM8Anz5Eu8uAgaBw9yvc/fU01iWTW5e7n+juxwLnAucD3w25JhlEQTFJmFkJcAbwWQYFhZl9w8xeSf2Su97MPgksBu5I/corNLMnzGyxmX3BzH446LGXmdlNqeW/MLPnU4/5VzOLjvNblCzg7q3AVcDVlhQ1sx+Z2Qtm9rKZfa6/7dDPbmrdiWb2XKrtb8ys1MzqzezFQY9baGZrx//dZS4FxeTxMWC1u78F7Dazk8zsvNT609x9EfBDd/8V0AAsT/3K6xr0HL8CPjHo/n8D7jKz96eWz3D3E4E4sHwc3pNkIXdvJPndVEXyh027u58CnAJcaWbzh/vsph5+O/A/3f0E4BXgu+6+EWg3sxNTbS4HfjFubygL6JrZk8clwE9Ty3em7keAn7v7fgB33x30BO7eZmaNZrYEWA8cBTwLfAk4GXjBzAAKgdZ0vAmZNCz198PACak9XYBpwELgTxjy2TWzacB0d38y1XYV8MvU8q3A5Wb2NZI/ak4dh/eQNRQUk4CZlQNnA8eZmQNRwIG7U38Px13AnwPrgN+4u1syHVa5+zfHsGyZpMysjuReaSvJwPiyuz88pM1HObzP7t0k+z0eB9a6+64xKndS0KGnyeGTwO3uPs/da919DrAJ2A38dzMrAjCzslT7fcCUEZ7r1yR3+S8hGRoAjwGfNLOq/ucxs3npeSuSzcysElgB3OzJ0cAPA18ws9zU9veZWTHwCEM+u+7eDuwxsw+lnu4vgScB3L079Vy3AD8fz/eUDbRHMTlcAlw/ZN3dwPuB+4AGM+sFHgS+RfL47Qoz6wI+OPhB7r7HzF4HjnH351PrXjez7wCPpE5r7CN5OEpTvctoFJrZS0AuEAP+HfhxatutQC3wYmrPtQ34mLuvTvU5DP3sfhp2b6AAAAGcSURBVIbkZ7cIaCTZH9HvDpJ9bI+k/y1lF03hISKTgpn9NTDN3f827FoyjfYoRCTrmdlvgHqSfXVymLRHISIigdSZLSIigRQUIiISSEEhIiKBFBQiR8jMOg6xvdbMXj3M5/zFoNHIIqFSUIiISCAFhcgYMbMSM3vMzF5MzWp68aDNOWa2KjWr6a8GjSg+2cyeNLO1Zvawmc0IqXyRESkoRMZON/Bxdz8JWAbckBpNDMkJFFemZjXdC3wxNS3FTcAn3f1k4Dbgn0KoWySQBtyJjB0Dvm9mZwIJYBZQndq2xd2fTS3/X+AaYDVwHPBoKk+iwPZxrVhkFBQUImNnOVAJnOzufWbWBBSktg0d2eokg+U1d/8gIhOYDj2JjJ1pQGsqJJYBg2fQnWtm/YHQf0naN4HK/vVmlmtmx45rxSKjoKAQGTt3AIvNrIHk3sW6QdveAD5jZi8DZcAt7t5Lcgr4H5jZH4GXgNPHuWaRQ9JcTyIiEkh7FCIiEkhBISIigRQUIiISSEEhIiKBFBQiIhJIQSEiIoEUFCIiEuj/A08SZ24p15hMAAAAAElFTkSuQmCC\n",
      "text/plain": [
       "<Figure size 432x288 with 1 Axes>"
      ]
     },
     "metadata": {
      "needs_background": "light"
     },
     "output_type": "display_data"
    }
   ],
   "source": [
    "sns.violinplot(x=tmp_df[\"label\"], y=tmp_df[\"charge\"])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "In this case, we see a significant difference.  All of the active molecules are neutral, while some of the decoys \n",
    "are charged. Let's see what fraction of the decoy molecules are charged. We can do this by creating a new dataframe\n",
    "with just the charged molecules."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 25,
   "metadata": {},
   "outputs": [],
   "source": [
    "charged = decoy_df[decoy_df[\"charge\"] != 0]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "A Pandas dataframe has a `shape` property that returns a tuple with the number of rows and columns in the dataframe. As such,\n",
    "`shape[0]` is the number of rows.  Let's divide the number of rows in our dataframe of \n",
    "charged molecules by the total number of rows in the decoy dataframe."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 26,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "0.16175824175824177"
      ]
     },
     "execution_count": 26,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "charged.shape[0]/decoy_df.shape[0]"
   ]
  },
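  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "As an aside, the same fraction can be computed without an intermediate dataframe, because the mean of a boolean mask is the fraction of `True` values. The sketch below is a minimal illustration on a small made-up dataframe; `toy_df` and its values are assumptions for demonstration and are not part of the DUD-E data.\n",
    "\n",
    "```python\n",
    "import pandas as pd\n",
    "\n",
    "# Toy stand-in for decoy_df; the charges are invented for illustration.\n",
    "toy_df = pd.DataFrame({'ID': ['a', 'b', 'c', 'd'], 'charge': [0, 1, 0, -1]})\n",
    "\n",
    "# Approach used in this notebook: filter rows, then compare row counts.\n",
    "charged_toy = toy_df[toy_df['charge'] != 0]\n",
    "fraction_by_shape = charged_toy.shape[0] / toy_df.shape[0]\n",
    "\n",
    "# Equivalent shortcut: the mean of a boolean Series is the fraction of True values.\n",
    "fraction_by_mean = (toy_df['charge'] != 0).mean()\n",
    "\n",
    "print(fraction_by_shape, fraction_by_mean)\n",
    "```\n",
    "\n",
    "Both approaches should give the same number; filtering first is handy when the charged rows themselves are needed for inspection."
   ]
  },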
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The fact that 16% of the decoy compounds are charged, while none of the active compounds are, is a concern.  An examination of both sets indicates that charge states were assigned to the decoys, but not to the active molecules. In order to be consistent, we will use some code from the RDKit Cookbook to neutralize the molecules. First, we will import the `NeutraliseCharges` function from the `neutralize` module."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 27,
   "metadata": {},
   "outputs": [],
   "source": [
    "from neutralize import NeutraliseCharges"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Now we will create a new dataframe with the SMILES, ID, and label for the decoys.  "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 28,
   "metadata": {},
   "outputs": [],
   "source": [
    "revised_decoy_df = decoy_df[[\"SMILES\",\"ID\",\"label\"]].copy()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "With this new dataframe in hand, we can replace the SMILES with the SMILES for the neutral form of the molecule. The\n",
    "NeutraliseCharges function returns two values.  The first is the SMILES for the neutral form of the molecule and the second is a boolean variable indicating whether the molecule was changed.  In the code below, we only need the SMILES, so we will use the first element of the tuple returned by NeutraliseCharges."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 29,
   "metadata": {},
   "outputs": [],
   "source": [
    "revised_decoy_df[\"SMILES\"] = [NeutraliseCharges(x)[0] for x in revised_decoy_df[\"SMILES\"]]"
   ]
  },
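  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The second value returned for each molecule can be used to count how many structures were actually modified. The sketch below assumes only the return shape described above, a `(smiles, changed)` tuple. `toy_neutralise` is a hypothetical stand-in that rewrites two simple charged groups; it is not the RDKit Cookbook implementation.\n",
    "\n",
    "```python\n",
    "# Hypothetical stand-in that mimics the (smiles, changed) return shape.\n",
    "# It only rewrites two simple charged groups and is NOT the Cookbook code.\n",
    "def toy_neutralise(smiles):\n",
    "    neutral = smiles.replace('[O-]', 'O').replace('[NH3+]', 'N')\n",
    "    return neutral, neutral != smiles\n",
    "\n",
    "toy_smiles = ['CC(=O)[O-]', 'CCO', 'C[NH3+]']\n",
    "\n",
    "# Call the function once per molecule, then split the tuples apart.\n",
    "results = [toy_neutralise(s) for s in toy_smiles]\n",
    "neutral_smiles = [smi for smi, _ in results]\n",
    "n_changed = sum(changed for _, changed in results)\n",
    "\n",
    "print(neutral_smiles, n_changed)\n",
    "```\n",
    "\n",
    "Keeping the full tuples in `results` avoids calling the function twice when both the SMILES and the changed flag are of interest."
   ]
  },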
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Once we've replaced the SMILES, we can add a molecule column to our new dataframe and calculate the properties again. "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 30,
   "metadata": {},
   "outputs": [],
   "source": [
    "PandasTools.AddMoleculeColumnToFrame(revised_decoy_df,\"SMILES\",\"Mol\")\n",
    "add_property_columns_to_df(revised_decoy_df)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We can now combine the dataframe of active molecules with the one containing the revised, neutral decoys and draw\n",
    "another violin plot. "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 31,
   "metadata": {},
   "outputs": [],
   "source": [
    "new_tmp_df = pd.concat([active_df, revised_decoy_df])"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 32,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "<matplotlib.axes._subplots.AxesSubplot at 0x7f6a9649dc18>"
      ]
     },
     "execution_count": 32,
     "metadata": {},
     "output_type": "execute_result"
    },
    {
     "data": {
      "image/png": "iVBORw0KGgoAAAANSUhEUgAAAZAAAAEGCAYAAABLgMOSAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADh0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uMy4xLjEsIGh0dHA6Ly9tYXRwbG90bGliLm9yZy8QZhcZAAAaOUlEQVR4nO3de5CldX3n8fenb8NFC0EHHG7C6mwMpJRgCzHsajBigKwZpUyEdd3RTXZiIpqYvbGb3SRVW0kRU5qLMbCjS8SsCdmoxKkw4SKbhI0pVxoKuUqYEJRxJswAE1BAunv6u3+cp4dD091z5pk+fWam36+qU+d5fpdzfmfm1Pn083tuqSokSdpXQ4MegCTp4GSASJJaMUAkSa0YIJKkVgwQSVIrI4MewHJ62cteVqeccsqghyFJB5Xbbrvt0apaPbd8RQXIKaecwsTExKCHIUkHlSTfmK/cKSxJUisGiCSpFQNEktSKASJJasUAkSS1YoBIkloxQCRJrQw0QJJclWRHkrsXqE+S30myJcmdSc7sqjs/yf1N3WXLN2pJBzpvU7E8Br0F8mng/EXqLwDWNo8NwBUASYaBTzT1pwGXJDmtryOVdFD41re+xUUXXcQ999wz6KEc8gYaIFV1C/D4Ik3WAZ+pjq8AL0myBjgL2FJVD1bVJHBN01bSCjcxMcGuXbu4/vrrBz2UQ96gt0D25gTg4a71rU3ZQuUvkGRDkokkEzt37uzbQCVppTnQAyTzlNUi5S8srNpYVeNVNb569QuuBSZJaulAv5jiVuCkrvUTgW3A2ALlkla4pPP3pTvS++9A3wLZBPzr5misHwCeqKrtwK3A2iSnJhkDLm7aShLwXJCofwa6BZLkj4AfAl6WZCvwy8AoQFVdCWwGLgS2AE8D72vqppNcCtwADANXVZWHXEjawy2Q/htogFTVJXupL+ADC9RtphMwkrSHWx7L50CfwpIkHaAMEElSKwaIJKkVA0SS1IoBIklqxQCRJLVigEiSWjFAJEmtGCCSpFYMEElSKwaIJKkVA0SS1IoBIklqxQCRJLVigEiSWhlogCQ5P8n9SbYkuWye+v+Q5I7mcXeS3UmOaeoeSnJXUzex/KOXpJVtYDeUSjIMfAI4j869z29Nsqmq7p1tU1W/AfxG0/5twIer6vGulzm3qh5dxmFLkhqD3AI5C9hSVQ9W1SRwDbBukfaXAH+0LCOTJO3VIAPkBODhrvWtTdkLJDkCOB/4fFdxATcmuS3JhoXeJMmGJBNJJnbu3LkEw5YkwWADZL4bF9cCbd8GfHnO9NU5VXUmcAHwgSRvnK9jVW2sqvGqGl+9evX+jViStMcgA2QrcFLX+onAtgXaXsyc6auq2tY87wCupTMlJklaJoMMkFuBtUlOTTJGJyQ2zW2U5CjgTcAXu8qOTPLi2WXgrcDdyzJqSRIwwKOwqmo6yaXADcAwcFVV3ZPk/U39lU3TdwA3VtVTXd2PA65NAp3P8IdVdf3yjV6SNLAAAaiqzcDmOWVXzln/NPDpOWUPAq/t8/AkSYvwTHRJUisGiCSpFQNEktSKASJJasUAkSS1YoBIkloxQCRJrRggkqRWDBBJUisGiCSpFQNEktSKASJJasUAkSS1YoBIkloxQCRJrQw0QJKcn+T+JFuSXDZP/Q8leSLJHc3jl3rtK0nqr4HdUCrJMPAJ4Dw690e/Ncmmqrp3TtP/W1X/omVfSVKfDHIL5CxgS1U9WFWTwDXAumXoK0laAoMMkBOAh7vWtzZlc70hydeS/HmS0/exrySpTwZ5T/TMU1Zz1m8HXlFV30lyIfCnwNoe+3beJNkAbAA4+eST249WkvQ8g9wC2Qqc1LV+IrCtu0FVPVlV32mWNwOjSV7WS9+u19hYVeNVNb569eqlHL8krWiDDJBbgbVJTk0yBlwMbOpukOTlSdIsn0VnvI/10leS1F8Dm8KqqukklwI3
AMPAVVV1T5L3N/VXAu8EfibJNPAMcHFVFTBv34F8EElaoQa5D2R2WmrznLIru5Z/F/jdXvtKkpaPZ6JLkloxQCRJrRggkqRWDBBJUisGiCSpFQNEktSKASJJasUAkSS1YoBIkloxQCRJrRggkqRWDBBJUisGiCSpFQNEktSKASJJasUAkSS1MtAASXJ+kvuTbEly2Tz1705yZ/P4mySv7ap7KMldSe5IMrG8I5ckDeyOhEmGgU8A5wFbgVuTbKqqe7ua/T3wpqraleQCYCNwdlf9uVX16LINWpK0xyC3QM4CtlTVg1U1CVwDrOtuUFV/U1W7mtWvACcu8xglSQsYZICcADzctb61KVvITwJ/3rVewI1JbkuyYaFOSTYkmUgysXPnzv0asCTpOQObwgIyT1nN2zA5l06A/LOu4nOqaluSY4Gbkny9qm55wQtWbaQz9cX4+Pi8ry9J2neD3ALZCpzUtX4isG1uoySvAT4FrKuqx2bLq2pb87wDuJbOlJgkaZkMMkBuBdYmOTXJGHAxsKm7QZKTgS8A76mqv+0qPzLJi2eXgbcCdy/byCVJg5vCqqrpJJcCNwDDwFVVdU+S9zf1VwK/BLwU+L0kANNVNQ4cB1zblI0Af1hV1w/gY0jSijXIfSBU1WZg85yyK7uWfwr4qXn6PQi8dm65JGn5eCa6JKkVA0SS1IoBIklqZZ8CpDniSZKk3gIkyQ8muRe4r1l/bZLf6+vIJEkHtF63QH4T+BHgMYCq+hrwxn4NSpJ04Ot5CquqHp5TtHuJxyJJOoj0eh7Iw0l+EKjmrPEP0UxnSZJWpl63QN4PfIDO1XK3Amc065KkFaqnLZDmpk3v7vNYJEkHkZ4CJMnvzFP8BDBRVV9c2iFJkg4GvU5hHUZn2uqB5vEa4BjgJ5P8Vp/GJkk6gPW6E/1VwJurahogyRXAjXTuZ35Xn8YmSTqA9boFcgLQfRb6kcDxVbUbeHbJRyVJLU1OTgIwNTU14JEc+nrdAvkIcEeSv6RzK9o3Ar/WXNrkS30amyQtamZmhqmpKSYnJ/c8HnroIQAef/xxtm/fztjYGGNjY4yOjjI2NsbQkJcAXCqpWvw24enctelEYJrObWMDfHX2lrL79ebJ+cBv07mh1Keq6vJ53vu3gQuBp4H3VtXtvfSdz/j4eE1MTOzvsCUBu3fvZnJy8nk/4HN/zLsfi7VbtN/ks0w++2xnfWqKqdnnqSmmpvf9fOaR4WHGxkYZHRl5Llya57FVqxgbW7WnvDt45i7Pt95Lu9HRUUZGBnorpn2W5LbmZn7Ps9dPUVWV5E+r6nXAkh1xlWQY+ASd/ShbgVuTbKqqe7uaXQCsbR5nA1cAZ/fYV1pRduzYwa5du5bsR32y64d7aras+eGenJpi9+6Z/R7zcGB0OIwOw+hQMToEoylGhorRzDA6VBw2VLxotm60GF31XNuRVGd5GMaGipHAo98NT00N8eKxGV56WDE1A1Mz6XrO88qmZ8LU0zD1nc76t2eGmK4wVUPPaz+9GyZniiX42AwNDTE2OtIEy2y4rGJ0bIxVY6sYW7VqycLqqKOOYs2aNfs/6Hn0GoNfSfL6qrp1Cd/7LGBLc3dBklwDrAO6Q2Ad8JnqbCZ9JclLkqwBTumh75L7+Mc/zpYtW/r5FjpIvepVr+KDH/zgwN7/0Ucf5V3vehd7m1HoNjLU/HgPdf14DxUjmdnz433kELxktm5VMXL4bNt6Yb+h+etGmuexodqzPFs3lKX9d3jgiRH+5OHj+JELf5QbNl/Hh179CGuPml7S95gpmH5BKD0XTtMzMNkdTosFV3fZMzD1VKf/0zNNgFWYmhna02Z6BqZ2d0JxX3zyk59k7dq1S/rvAL0HyLnATyf5BvAUnWmsqqrX7Md7nwB0X19rK52tjL21OaHHvgAk2QBsADj55JP3Y7jSgeupp56iqjjn5d/l1S+Z5vCR4oiR4oiRGQ4bLsa6fshnf+yX+sf7QPD1XSP8yIU/
ys9+4INUwddv/8ySB8hQYGwYxoZnw7r30F4qVTBdc8JrNzyzOzwzHZ6eHuLp6fDN7wxz09bDefLJJ/syjl4D5II+vPd8X9+5/xMLtemlb6ewaiOwETr7QPZlgHMN8i9MaTFjY2MAfPkfDuPL//DC+s4UUbO1MVyMBkaGZhhNMTr7PDy7BdJsIQw/N6U022+2bux5Wxxzn19Yt1yB9eqjp/mdzddRBTf++XV86NVLGx4LqeK5rYSCqd3ND3tlz4/7dD3/x35PXbM1Mtm9lTHPFkvntWen1YaeC5DdMDVTTO1eOMpWrVrVl8/d66VMvgGQ5Fg6JxUuha3ASV3rJwJzd8wv1Gash77SirFmzRo+9rGP8dhjj7XbB/Lss3x7avK5ndXfndyzr2NqcoqZfZgaW8h8U2YjewJsZs5U1wtD6YVTZfPX/fjJO3l04g9418m7CXDfrpF5p5nm/YGed1qq68d7zw9411RUiyml+SRhbHSEsdHRzk792X0ZzY79w+bZ/9HrPpDTTjtt/wc4j14vZfJjwEeB44EdwCvoXI339P1471uBtUlOBb4FXAz8yzltNgGXNvs4zgaeqKrtSXb20FdaUc4888y+vfb09HRPR1C132Hf2Wn/9LOTPDH1LJPPTnYeS7zTfiGzO7Wf+wEeZezwVXt2bh+5atWSHoU1X5/h4WE6B54ePHqdwvrvwA8AX6qq709yLnDJ/rxxVU0nuRS4gc6huFdV1T1J3t/UXwlspnMI7xY6h/G+b7G++zMeSQsbGRlhZGSEI444YmBj2L17d09hdd1113HLLbdw2mmnsX79+r3+qB+Mh9UeKHr9V5uqqseSDCUZqqq/SPLr+/vmVbWZTkh0l13ZtVwscNn4+fpKOnQNDw8zPDzMYYctPov+zW9+k1tuuYXjjz+es8+e99gaLZFeA+Qfk7wIuAX4bJIddE4slKQDymzA7C1otP96Pad/HfAM8GHgeuDvgLf1a1CSpANfr0dhPdW1enWfxiJJOoj0tAWS5KIkDyR5IsmTSb6dpD9npkiSDgr7cjXet1XVff0cjCTp4NHrPpBHDA9JUrdFt0CSXNQsTiT5Y+BP6bqBVFV9oY9jkyQdwPY2hTV7pFXROZHvrV11BRggkrRCLRogVfU+gCRXAz9XVf/YrB9N59ImkqQVqtd9IK+ZDQ+AqtoFfH9/hiRJOhj0GiBDzVYHAEmOofcjuCRJh6BeQ+CjwN8k+RydfR8/Afxq30YlSTrg9Xom+meSTABvpnMzp4u8/7gkrWw9T0M1gWFoSJKA3veBSJL0PAMJkCTHJLmpub7WTd076LvanJTkL5Lcl+SeJD/XVfcrSb6V5I7mceHyfgJJ0qC2QC4Dbq6qtcDNzfpc08C/q6rvpXM3xA8k6b6x729W1RnNwxtLSdIyG1SArOO5y8JfDbx9boOq2l5VtzfL36ZzD/YTlm2EkqRFDSpAjquq7dAJCuDYxRonOYXOiYv/r6v40iR3Jrlqvimwrr4bkkwkmdi5c+f+j1ySBPQxQJJ8Kcnd8zzW7ePrvAj4PPDzVTV7D5IrgFcCZwDbWeSyKlW1sarGq2p89erVLT+NJGmuvp1NXlVvWaguySNJ1lTV9iRrgB0LtBulEx6f7b7yb1U90tXmk8CfLd3IJUm9GNQU1iZgfbO8Hvji3AZJAvxP4L6q+ticujVdq+8A7u7TOCVJCxhUgFwOnJfkAeC8Zp0kxyeZPaLqHOA9wJvnOVz3I0nuSnIncC7w4WUevySteAO5IGJVPQb88Dzl24ALm+W/pnPZlPn6v6evA5Qk7ZVnokuSWjFAJEmtGCCSpFYMEElSKwaIJKkVA0SS1IoBIklqxQCRJLVigEiSWjFAJEmtGCCSpFYMEElSKwaIJKkVA0SS1IoBIklqZSABkuSYJDcleaB5PnqBdg81N466I8nEvvaXJPXPoLZALgNurqq1wM3N+kLOraozqmq8ZX9JUh8MKkDWAVc3y1cDb1/m/pKk/TSo
ADmuqrYDNM/HLtCugBuT3JZkQ4v+kqQ+6ds90ZN8CXj5PFW/uA8vc05VbUtyLHBTkq9X1S37OI4NwAaAk08+eV+6SpIW0bcAqaq3LFSX5JEka6pqe5I1wI4FXmNb87wjybXAWcAtQE/9m74bgY0A4+Pj1f4TSZK6DWoKaxOwvlleD3xxboMkRyZ58ewy8Fbg7l77S5L6a1ABcjlwXpIHgPOadZIcn2Rz0+Y44K+TfA34KnBdVV2/WH9J0vLp2xTWYqrqMeCH5ynfBlzYLD8IvHZf+kuSlo9nokuSWjFAJEmtGCCSpFYMEElSKwaIJKkVA0SS1IoBIklqxQCRJLVigEiSWjFAJEmtGCCSpFYMEElSKwaIJKkVA0SS1IoBIklqxQCRJLUykABJckySm5I80DwfPU+b70lyR9fjySQ/39T9SpJvddVduPyfQpJWtkFtgVwG3FxVa4Gbm/Xnqar7q+qMqjoDeB3wNHBtV5PfnK2vqs1z+0uS+mtQAbIOuLpZvhp4+17a/zDwd1X1jb6OSpLUs0EFyHFVtR2geT52L+0vBv5oTtmlSe5MctV8U2CzkmxIMpFkYufOnfs3aknSHn0LkCRfSnL3PI91+/g6Y8CPAX/SVXwF8ErgDGA78NGF+lfVxqoar6rx1atXt/gkkqT5jPTrhavqLQvVJXkkyZqq2p5kDbBjkZe6ALi9qh7peu09y0k+CfzZUoxZktS7QU1hbQLWN8vrgS8u0vYS5kxfNaEz6x3A3Us6OknSXg0qQC4HzkvyAHBes06S45PsOaIqyRFN/Rfm9P9IkruS3AmcC3x4eYYtSZrVtymsxVTVY3SOrJpbvg24sGv9aeCl87R7T18HKEnaK89ElyS1YoBIkloxQCRJrRggkqRWDBBJUisGiCSpFQNEktSKASJJasUAkSS1YoBIkloxQCRJrRggkqRWDBBJUisGiCSpFQNE0iGlqgY9hBVjIAGS5MeT3JNkJsn4Iu3OT3J/ki1JLusqPybJTUkeaJ6PXp6RSzrQzQZIkgGP5NA3qC2Qu4GLgFsWapBkGPgEnXuinwZckuS0pvoy4OaqWgvc3KxL0p4AmZmZGfBIDn0DCZCquq+q7t9Ls7OALVX1YFVNAtcA65q6dcDVzfLVwNv7M1JJBxunsJbPgbwP5ATg4a71rU0ZwHFVtR2geT52oRdJsiHJRJKJnTt39m2wkg4M09PTgx7CitG3e6In+RLw8nmqfrGqvtjLS8xTts9/WlTVRmAjwPj4uH+aSIe40dHRQQ9hxehbgFTVW/bzJbYCJ3Wtnwhsa5YfSbKmqrYnWQPs2M/3knSIWLt2LQCvf/3rBzySQ9+BPIV1K7A2yalJxoCLgU1N3SZgfbO8Huhli0bSCnD66adz7bXX8qY3vWnQQznkDeow3nck2Qq8AbguyQ1N+fFJNgNU1TRwKXADcB/wv6vqnuYlLgfOS/IAcF6zLkkAHH300R7Guwyyko5YGB8fr4mJiUEPQ5IOKkluq6oXnLN3IE9hSZIOYAaIJKkVA0SS1IoBIklqxQCRJLVigEiSWllRh/Em2Ql8Y9DjOIS8DHh00IOQ5uF3c2m9oqpWzy1cUQGipZVkYr5jw6VB87u5PJzCkiS1YoBIkloxQLQ/Ng56ANIC/G4uA/eBSJJacQtEktSKASJJasUA0ez9WSrJq/fS7r1Jju9a/1SS0/o/Qq1ESXYnuSPJPUm+luQXkvibdQDxP0MAlwB/Teeuj4t5L7AnQKrqp6rq3j6OSyvbM1V1RlWdTufGcRcCvzzgMamLAbLCJXkRcA7wk3QFSJL/mOSu5i+/y5O8ExgHPtv8VXh4kr9MMp7kZ5J8pKvve5N8vFn+V0m+2vT5H0mGl/kj6hBQVTuADcCl6RhO8htJbk1yZ5Kfnm0797vblJ2R5CtN22uTHJ3klUlu7+q3Nslty//pDl4GiN4OXF9Vfws8nuTMJBc05WdX1WuBj1TV54AJ4N3NX4XPdL3G54CLutbf
Bfxxku9tls+pqjOA3cC7l+Ez6RBUVQ/S+c06ls4fPE9U1euB1wP/Nsmp8313m+6fAf5TVb0GuAv45ar6O+CJJGc0bd4HfHrZPtAhYGTQA9DAXQL8VrN8TbM+BPx+VT0NUFWPL/YCVbUzyYNJfgB4APge4MvAB4DXAbc296c+HNjRjw+hFWP2RudvBV7TbBkDHAWsBd7CnO9ukqOAl1TVXzVtrwb+pFn+FPC+JL9A54+ds5bhMxwyDJAVLMlLgTcD35ekgGGggM83z/vij4GfAL4OXFtVlU5qXF1V/3kJh60VKsk/obMVu4NOkHywqm6Y0+Z89u27+3k6+1X+D3BbVT22RMNdEZzCWtneCXymql5RVadU1UnA3wOPA/8myREASY5p2n8bePECr/UFOlMHl9AJE4CbgXcmOXb2dZK8oj8fRYeyJKuBK4Hfrc7ZzzcAP5NktKn/p0mOBG5kzne3qp4AdiX5583LvQf4K4Cq+m7zWlcAv7+cn+lQ4BbIynYJcPmcss8D3wtsAiaSTAKbgf9CZ374yiTPAG/o7lRVu5LcC5xWVV9tyu5N8l+BG5vDL6foTGt5SX314vAkdwCjwDTwB8DHmrpPAacAtzdbujuBt1fV9c0+jbnf3fV0vrtHAA/S2d8x67N09uHd2P+PdGjxUiaSVrQk/x44qqr+26DHcrBxC0TSipXkWuCVdPYFah+5BSJJasWd6JKkVgwQSVIrBogkqRUDROqTJN/ZS/0pSe7ex9f8dNfZ19JAGSCSpFYMEKnPkrwoyc1Jbm+uEruuq3okydXNVWI/13UG9euS/FWS25LckGTNgIYvLcgAkfrvu8A7qupM4Fzgo83Z09C58OTG5iqxTwI/21ye4+PAO6vqdcBVwK8OYNzSojyRUOq/AL+W5I3ADHACcFxT93BVfblZ/l/Ah4Drge8DbmpyZhjYvqwjlnpggEj9925gNfC6qppK8hBwWFM390zeohM491TVG5AOYE5hSf13FLCjCY9zge4rEp+cZDYoZm8tfD+werY8yWiS05d1xFIPDBCp/z4LjCeZoLM18vWuuvuA9UnuBI4BrqiqSTqX2v/1JF8D7gB+cJnHLO2V18KSJLXiFogkqRUDRJLUigEiSWrFAJEktWKASJJaMUAkSa0YIJKkVv4/C3f4VYcAx7gAAAAASUVORK5CYII=\n",
      "text/plain": [
       "<Figure size 432x288 with 1 Axes>"
      ]
     },
     "metadata": {
      "needs_background": "light"
     },
     "output_type": "display_data"
    }
   ],
   "source": [
    "sns.violinplot(x=new_tmp_df[\"label\"], y=new_tmp_df[\"charge\"])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "An examination of the plot above shows that there are very few charged molecules left in the decoy set.  We can use the same \n",
    "technique we used above to create a dataframe with only the charged molecules.  We can then use this dataframe to determine the fraction of charged molecules remaining in the set. "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 33,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "0.0026373626373626374"
      ]
     },
     "execution_count": 33,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "charged = revised_decoy_df[revised_decoy_df[\"charge\"] != 0]\n",
    "charged.shape[0]/revised_decoy_df.shape[0]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We have now reduced the fraction of charged compounds from 16% to 0.3%.  We can now be confident that our active and decoy sets are reasonably well balanced.  "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "In order to use these datasets with DeepChem we need to write the molecules out as a CSV file consisting of the SMILES, the ID, and an integer value indicating whether the compounds are active (labeled as 1) or inactive (labeled as 0)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 34,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>SMILES</th>\n",
       "      <th>ID</th>\n",
       "      <th>is_active</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>Cn1ccnc1Sc2ccc(cc2Cl)Nc3c4cc(c(cc4ncc3C#N)OCCCN5CCOCC5)OC</td>\n",
       "      <td>168691</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>C[C@@]12[C@@H]([C@@H](CC(O1)n3c4ccccc4c5c3c6n2c7ccccc7c6c8c5C(=O)NC8)NC)OC</td>\n",
       "      <td>86358</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>Cc1cnc(nc1c2cc([nH]c2)C(=O)N[C@H](CO)c3cccc(c3)Cl)Nc4cccc5c4OC(O5)(F)F</td>\n",
       "      <td>575087</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>Cc1cnc(nc1c2cc([nH]c2)C(=O)N[C@H](CO)c3cccc(c3)Cl)Nc4cccc5c4OCO5</td>\n",
       "      <td>575065</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>Cc1cnc(nc1c2cc([nH]c2)C(=O)N[C@H](CO)c3cccc(c3)Cl)Nc4cccc5c4CCC5</td>\n",
       "      <td>575047</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "                                              SMILES      ID  is_active\n",
       "0  Cn1ccnc1Sc2ccc(cc2Cl)Nc3c4cc(c(cc4ncc3C#N)OCCC...  168691          1\n",
       "1  C[C@@]12[C@@H]([C@@H](CC(O1)n3c4ccccc4c5c3c6n2...   86358          1\n",
       "2  Cc1cnc(nc1c2cc([nH]c2)C(=O)N[C@H](CO)c3cccc(c3...  575087          1\n",
       "3  Cc1cnc(nc1c2cc([nH]c2)C(=O)N[C@H](CO)c3cccc(c3...  575065          1\n",
       "4  Cc1cnc(nc1c2cc([nH]c2)C(=O)N[C@H](CO)c3cccc(c3...  575047          1"
      ]
     },
     "execution_count": 34,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# Label actives as 1 and decoys as 0, then stack the two dataframes\n",
    "active_df[\"is_active\"] = 1\n",
    "revised_decoy_df[\"is_active\"] = 0\n",
    "combined_df = pd.concat([active_df, revised_decoy_df])[[\"SMILES\",\"ID\",\"is_active\"]]\n",
    "combined_df.head()"
   ]
  },
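  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "One detail worth checking after stacking two dataframes is the row index: by default each input keeps its own labels, so the combined frame can contain duplicates. The sketch below uses `pd.concat` on two made-up frames (the names and values are illustrative assumptions) and shows `ignore_index=True` as one way to renumber the rows.\n",
    "\n",
    "```python\n",
    "import pandas as pd\n",
    "\n",
    "# Toy stand-ins for the active and decoy dataframes; values are invented.\n",
    "toy_active = pd.DataFrame({'SMILES': ['CCO', 'CCN'], 'is_active': 1})\n",
    "toy_decoy = pd.DataFrame({'SMILES': ['CCC'], 'is_active': 0})\n",
    "\n",
    "# Default behaviour keeps each frame's original index labels.\n",
    "stacked = pd.concat([toy_active, toy_decoy])\n",
    "\n",
    "# ignore_index=True assigns a fresh 0..n-1 index to the combined frame.\n",
    "renumbered = pd.concat([toy_active, toy_decoy], ignore_index=True)\n",
    "\n",
    "print(list(stacked.index), list(renumbered.index))\n",
    "```\n",
    "\n",
    "Duplicate labels are harmless when the index is dropped on export, but they can surprise label-based lookups such as `.loc`."
   ]
  },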
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Our final step in this section is to save our new combined_df as a csv file.  The index=False option causes Pandas to not include the row number in the first column. "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 35,
   "metadata": {},
   "outputs": [],
   "source": [
    "combined_df.to_csv(\"dude_erk1_mk01.csv\",index=False)"
   ]
  },
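  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "To see what `index=False` changes, the following sketch writes a small made-up dataframe to a string rather than a file (calling `to_csv` without a path returns the CSV text). The dataframe contents are illustrative assumptions, not rows from the real dataset.\n",
    "\n",
    "```python\n",
    "import pandas as pd\n",
    "\n",
    "# Toy dataframe with the same column names as combined_df; values are invented.\n",
    "toy_df = pd.DataFrame({'SMILES': ['CCO', 'CC(=O)O'], 'ID': [1, 2], 'is_active': [1, 0]})\n",
    "\n",
    "# With no path argument, to_csv returns the CSV text as a string.\n",
    "with_index = toy_df.to_csv().splitlines()\n",
    "without_index = toy_df.to_csv(index=False).splitlines()\n",
    "\n",
    "print(with_index[0])     # header begins with an empty, unnamed index column\n",
    "print(without_index[0])  # header contains only the three data columns\n",
    "```\n",
    "\n",
    "Dropping the index keeps the file to exactly the three columns described above, which is simpler for downstream loaders that expect named columns only."
   ]
  },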
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.6.7"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
